2008-03-05 08:36:26

by Jan Beulich

Subject: [PATCH] x86: fix typo(?) in step.c

TIF_DEBUGCTLMSR has no meaning in the actual MSR...

Signed-off-by: Jan Beulich <[email protected]>

---
arch/x86/kernel/step.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.25-rc4/arch/x86/kernel/step.c 2008-03-05 09:26:15.000000000 +0100
+++ 2.6.25-rc4-x86-step-typo/arch/x86/kernel/step.c 2008-03-04 10:15:40.000000000 +0100
@@ -166,7 +166,7 @@ static void enable_step(struct task_stru
 				  child->thread.debugctlmsr | DEBUGCTLMSR_BTF);
 	} else {
 		write_debugctlmsr(child,
-				  child->thread.debugctlmsr & ~TIF_DEBUGCTLMSR);
+				  child->thread.debugctlmsr & ~DEBUGCTLMSR_BTF);

 		if (!child->thread.debugctlmsr)
 			clear_tsk_thread_flag(child, TIF_DEBUGCTLMSR);
@@ -189,7 +189,7 @@ void user_disable_single_step(struct tas
 	 * Make sure block stepping (BTF) is disabled.
 	 */
 	write_debugctlmsr(child,
-			  child->thread.debugctlmsr & ~TIF_DEBUGCTLMSR);
+			  child->thread.debugctlmsr & ~DEBUGCTLMSR_BTF);

 	if (!child->thread.debugctlmsr)
 		clear_tsk_thread_flag(child, TIF_DEBUGCTLMSR);



2008-03-05 13:42:27

by Ingo Molnar

Subject: Re: [PATCH] x86: fix typo(?) in step.c


* Jan Beulich <[email protected]> wrote:

> TIF_DEBUGCTLMSR has no meaning in the actual MSR...

thanks, applied. The effect of this bug is that block-stepping is not
disabled ... [~TIF_DEBUGCTLMSR masks out bit 25, while in the MSR we
want to mask out bit 1]

Roland - i guess this means block-stepping (a new ptrace feature in .25)
is not particularly well-tested. Do you have any standalone testcases
that could be run?

Ingo

2008-03-06 07:52:37

by Roland McGrath

Subject: Re: [PATCH] x86: fix typo(?) in step.c

> Roland - i guess this means block-stepping (a new ptrace feature in .25)
> is not particularly well-tested. Do you have any standalone testcases
> that could be run?

I'm pretty sure that no one really uses it yet. The test I used when I
originally wrote the feature is in the ptrace-tests suite. (See
http://sourceware.org/systemtap/wiki/utrace/tests about that suite.)
I haven't particularly tested it since then, so I wouldn't know if it got
broken later.

http://sources.redhat.com/cgi-bin/cvsweb.cgi/tests/ptrace-tests/tests/block-step.c?cvsroot=systemtap

Be sure to compile with current kernel-headers, or hand-tweak to define
PTRACE_SINGLEBLOCK. Use -std=gnu99 -D_GNU_SOURCE.

The bogon came in commit eee3af4a2c83a97fff107ddc445d9df6fded9ce4,
the introduction of the ptrace BTS stuff. Sorry I did not scour and
cite every problem in that patch, since I had NAK'd the entire thing
as needing more careful review and incremental introduction after 2.6.25.

As I said then, one of my concerns was with the low-level tweaks not yet
sufficiently baked, independent from my reservations about the ptrace
feature. Your #if'ing out of the user ABI additions for 2.6.25 does
nothing to remove the unknown new risks from all the tweaks with fingers in
the low-level arch stuff. This is the sort of thing I was concerned about.
(And this one is easy.)

The block-step test only tested that PTRACE_SINGLEBLOCK worked right.
I just souped it up to also test that PTRACE_SINGLESTEP still works
immediately afterwards. This still does not show any problem from this
bug. The case that would be broken by it is rather more arcane. I
haven't worked out the test case that fails with the bogon.


Thanks,
Roland

2008-03-06 11:28:54

by Ingo Molnar

Subject: Re: [PATCH] x86: fix typo(?) in step.c


* Roland McGrath <[email protected]> wrote:

> The bogon came in commit eee3af4a2c83a97fff107ddc445d9df6fded9ce4, the
> introduction of the ptrace BTS stuff. Sorry I did not scour and cite
> every problem in that patch, since I had NAK'd the entire thing as
> needing more careful review and incremental introduction after 2.6.25.

note that in -rc4 all those BTS ptrace extensions are disabled, see:

| commit b4ef95de00be4c2c30feccf607a45093c8c118b7
| Author: Ingo Molnar <[email protected]>
| Date: Tue Feb 26 09:40:27 2008 +0100
|
| x86: disable BTS ptrace extensions for now

Ingo

2008-03-06 11:34:40

by Ingo Molnar

Subject: Re: [PATCH] x86: fix typo(?) in step.c


* Roland McGrath <[email protected]> wrote:

> The block-step test only tested that PTRACE_SINGLEBLOCK worked right.
> I just souped it up to also test that PTRACE_SINGLESTEP still works
> immediately afterwards. This still does not show any problem from
> this bug. The case that would be broken by it is rather more arcane.
> I haven't worked out the test case that fails with the bogon.

my interpretation of the bug would be that we fail to mask out the
block-step MSR bit [because we mask out bit 25 instead of bit 1], and
hence the bug would cause that MSR bit to stay enabled in other tasks
too.

So in theory the bug should manifest itself as block-step mode never
clearing itself, once activated. (but this would never leak into other
tasks because we've got the thread.debugctlmsr abstraction that protects
them)

Ingo

2008-03-06 12:03:52

by Roland McGrath

Subject: Re: [PATCH] x86: fix typo(?) in step.c

> * Roland McGrath <[email protected]> wrote:
>
> > The bogon came in commit eee3af4a2c83a97fff107ddc445d9df6fded9ce4, the
> > introduction of the ptrace BTS stuff. Sorry I did not scour and cite
> > every problem in that patch, since I had NAK'd the entire thing as
> > needing more careful review and incremental introduction after 2.6.25.
>
> note that in -rc4 all those BTS ptrace extensions are disabled, see:

I know. That completely misses the point I just made:

As I said then, one of my concerns was with the low-level tweaks not yet
sufficiently baked, independent from my reservations about the ptrace
feature. Your #if'ing out of the user ABI additions for 2.6.25 does
nothing to remove the unknown new risks from all the tweaks with fingers in
the low-level arch stuff. This is the sort of thing I was concerned about.

You didn't revert the parts that ever could have caused problems for anyone
except those using the new ptrace extensions, i.e. changes to step.c,
context switch, whatever else was touched we've lost track of now. I keep
saying that those are not baked, 100% independent of the ptrace feature.
You don't seem to be hearing me.


Thanks,
Roland

2008-03-06 12:17:33

by Roland McGrath

Subject: Re: [PATCH] x86: fix typo(?) in step.c

> my interpretation of the bug would be that we fail to mask out the
> block-step MSR bit [because we mask out bit 25 instead of bit 1], and
> hence the bug would cause that MSR bit to stay enabled in other tasks
> too.

The wrong bit is in calls to write_debugctlmsr, only used when setting up a
thread to step. It does not affect context switch, so it would never have
an effect on other tasks as you suggest here.

> So in theory the bug should manifest itself as block-step mode never
> clearing itself, once activated.

That doesn't happen in the trivial sense of "never", because in the normal
case an actual block-step exception happens and that makes the hardware
clear BTF from the MSR (as well as TF from eflags). So it would only come
up in a more obscure case. That is, you set up for block-step but didn't
actually finish the user-mode instruction block. e.g. interrupted by a
signal or faulting instruction. The child stops again but not by SIGTRAP,
and next time you don't block-step it. Then, the BTF bit stays set in
thread.debugctlmsr and gets switched back in when the child runs again.
If you then resume with single-step instead, it will block-step because
BTF is set, but you wanted instruction-step. Like I said, I didn't
produce a case that behaved that way. I may be overlooking something.
But that's the scenario I imagine.

> (but this would never leak into other tasks because we've got the
> thread.debugctlmsr abstraction that protects them)

Correct.


Thanks,
Roland

2008-03-06 13:11:52

by Ingo Molnar

Subject: Re: [PATCH] x86: fix typo(?) in step.c


* Roland McGrath <[email protected]> wrote:

> I know. That completely misses the point I just made:
>
> As I said then, one of my concerns was with the low-level tweaks
> not yet sufficiently baked, independent from my reservations about
> the ptrace feature. Your #if'ing out of the user ABI additions for
> 2.6.25 does nothing to remove the unknown new risks from all the
> tweaks with fingers in the low-level arch stuff. This is the sort
> of thing I was concerned about.
>
> You didn't revert the parts that ever could have caused problems for
> anyone except those using the new ptrace extensions, i.e. changes to
> step.c, context switch, whatever else was touched we've lost track of
> now. I keep saying that those are not baked, 100% independent of the
> ptrace feature. You don't seem to be hearing me.

well the issue is that both regset and bts had regressions, so the
safest was to do the minimal step of undoing any externally visible
changes. Feel free to send a reverter patch for the other lowlevel bts
bits as well.

Ingo

2008-03-06 13:13:55

by Ingo Molnar

Subject: Re: [PATCH] x86: fix typo(?) in step.c


* Roland McGrath <[email protected]> wrote:

> > So in theory the bug should manifest itself as block-step mode never
> > clearing itself, once activated.
>
> That doesn't happen in the trivial sense of "never", because in the
> normal case an actual block-step exception happens and that makes the
> hardware clear BTF from the MSR (as well as TF from eflags). So it
> would only come up in a more obscure case. [...]

ah, i missed that detail - this indeed makes this code a lot less
dangerous.

/me stops worrying

Ingo

2008-03-10 11:52:52

by Jan Beulich

Subject: Re: [PATCH] x86: fix typo(?) in step.c

>>> Ingo Molnar <[email protected]> 06.03.08 14:11 >>>
>
>* Roland McGrath <[email protected]> wrote:
>
>> I know. That completely misses the point I just made:
>>
>> As I said then, one of my concerns was with the low-level tweaks
>> not yet sufficiently baked, independent from my reservations about
>> the ptrace feature. Your #if'ing out of the user ABI additions for
>> 2.6.25 does nothing to remove the unknown new risks from all the
>> tweaks with fingers in the low-level arch stuff. This is the sort
>> of thing I was concerned about.
>>
>> You didn't revert the parts that ever could have caused problems for
>> anyone except those using the new ptrace extensions, i.e. changes to
>> step.c, context switch, whatever else was touched we've lost track of
>> now. I keep saying that those are not baked, 100% independent of the
>> ptrace feature. You don't seem to be hearing me.
>
>well the issue is that both regset and bts had regressions, so the
>safest was to do the minimal step of undoing any externally visible
>changes. Feel free to send a reverter patch for the other lowlevel bts
>bits as well.

So, is this going to be fully reverted, or is it worth pointing out/fixing
other issues? The thing I'm recognizing right now is that
eee3af4a2c83a97fff107ddc445d9df6fded9ce4 made the writes to
DebugCtlMSR unconditional, which means any attempt to do
debugging on i[345]86 will ultimately cause the kernel to oops. All of
that stuff should really depend on CONFIG_X86_DEBUGCTLMSR...

Jan

2008-03-11 03:44:21

by Roland McGrath

Subject: Re: [PATCH] x86: fix typo(?) in step.c

> So, is this going to be fully reverted, or is it worth pointing out/fixing
> other issues? The thing I'm recognizing right now is that
> eee3af4a2c83a97fff107ddc445d9df6fded9ce4 made the writes to
> DebugCtlMSR unconditional, which means any attempt to do
> debugging on i[345]86 will ultimately cause the kernel to oops. All of
> that stuff should really depend on CONFIG_X86_DEBUGCTLMSR...

I think it would be wise to excise all the BTS-related additions until
after 2.6.25. But I am too swamped already and not planning to do
anything about this myself. But, that said, FWIW it does not look to
me like debugctlmsr will ever be written on hardware that doesn't
support it. The stores are all only enabled if TIF_DEBUGCTLMSR gets
set and thread.debugctlmsr is nonzero. That can't be set by ptrace
unless arch_has_block_step() returns true.


Thanks,
Roland