LinuxLists.cc - [Patch 00/11] Hardware Breakpoint interfaces

2009-04-07 06:35:18

Subject: [Patch 00/11] Hardware Breakpoint interfaces

Hi Alan,
I am sending you the patches with the changes mentioned in the
Changelog below. Please read the patches in conjunction with my replies
sent to your previous comments.

Let me know your thoughts on this.

Changelog
-----------
- Introduce (un)register_user_hw_breakpoint() which would take two parameters -
pointer to 'struct task_struct' and 'struct hw_breakpoint'.
- Change the return value behaviour in ptrace(). Return '0' for write in DR0 -
DR3 (with the exception being when the registered address is not in
user-space), while restricting error returns for write in DR7. Move memory
(de)allocation behaviour for 'struct hw_breakpoint' to ptrace related code and
flush_thread_hw_breakpoint() from unregister_user_hw_breakpoint() code.
- Addition of arch_flush_thread_hw_breakpoint() to zero-out thread->debugreg[n]
registers.
- Consolidate arch-specific kernel/thread update routines into
arch_update_kernel_hw_breakpoints() and arch_update_user_hw_breakpoint()
routines.
- Re-arrange the patchset to enable compilation after application of every
patch in the patchset.
- hw_breakpoint_handler() modified to:
- Unconditionally invoke (*trigger) function call
- Return NOTIFY_DONE only when other bits in DR6 are set or when
breakpoint originates due to an address in user-space
- Do an early return if trap bits are not set and we have entered the handler
because of BT, BS or BD bits in dr6 being set.
- samples/hw_breakpoint/data_breakpoint.c modified to accept module
parameters

The patchset is based on commit f0e36dc28173b65df2216dfae7109645d97a1bd9 of -tip
tree.

Thanks,
K.Prasad

2009-04-16 21:20:11

by Alan Stern

[permalink] [raw]

Subject: Re: [Patch 00/11] Hardware Breakpoint interfaces

On Tue, 7 Apr 2009, K.Prasad wrote:

> Hi Alan,
> I am sending you the patches with the changes mentioned in the
> Changelog below. Please read the patches in conjunction with my replies
> sent to your previous comments.
>
> Let me know your thoughts on this.

Sorry to take so long on this... lots of other things keep cropping up.

Mostly it looks okay; I noticed only a few problems. Comments inline
below.

Alan Stern

> --- arch/x86/include/asm/processor.h.orig
> +++ arch/x86/include/asm/processor.h
> @@ -429,8 +430,11 @@ struct thread_struct {
> unsigned long debugreg1;
> unsigned long debugreg2;
> unsigned long debugreg3;
> + unsigned long debugreg[HB_NUM];

You forgot to get rid of debugreg1 - debugreg3!

> +static int validate_settings(struct hw_breakpoint *bp, struct task_struct *tsk)
> +{
> + int ret;
> +
> + if (!bp)
> + return -EINVAL;

Is this test really needed?

> +
> + ret = arch_validate_hwbkpt_settings(bp, tsk);
> + if (ret < 0)
> + goto err;
> +
> + /* Check that the virtual address is in the proper range */
> + if (tsk) {
> + if (!arch_check_va_in_userspace(bp->info.address, bp->info.len))
> + return -EFAULT;
> + } else {
> + if (!arch_check_va_in_kernelspace(bp->info.address,
> + bp->info.len))
> + return -EFAULT;
> + }
> + err:
> + return ret;
> +}

At this point, almost nothing in the routine is arch-independent. You
might as well move everything into the arch-specific code and
eliminate the validate_settings() routine.

> +void unregister_kernel_hw_breakpoint(struct hw_breakpoint *bp)
> +{
> + int i, j;
> +
> + mutex_lock(&hw_breakpoint_mutex);
> +
> + /* Find the 'bp' in our list of breakpoints for kernel */
> + for (i = hbp_kernel_pos; i < HB_NUM; i++)
> + if (bp == hbp_kernel[i])
> + break;
> +
> + /* Check if we did not find a match for 'bp'. If so return early */
> + if (i == HB_NUM) {
> + mutex_unlock(&hw_breakpoint_mutex);
> + return;
> + }
> +
> + /*
> + * We'll shift the breakpoints one-level above to compact if
> + * unregistration creates a hole
> + */
> + for (j = i; j > hbp_kernel_pos; j--)
> + hbp_kernel[j] = hbp_kernel[j-1];
> +
> + hbp_kernel[hbp_kernel_pos] = 0;

s/0/NULL/

> + on_each_cpu(arch_update_kernel_hw_breakpoints, NULL, 0);
> + hbp_kernel_pos++;

Don't you want to increment hbp_kernel_pos _before_ calling
on_each_cpu()? Otherwise you're telling the other CPUs to install an
empty breakpoint.

> +/* Unmasked kernel DR7 value */
> +static unsigned long kdr7;
> +static const unsigned long kdr7_masks[HB_NUM] = {
> + 0xffff00ff, /* Same for 3, 2, 1, 0 */
> + 0xfff000fc, /* Same for 3, 2, 1 */
> + 0xff0000f0, /* Same for 3, 2 */
> + 0xf00f00c0 /* LEN3, R/W3, G3, L3 */
> +};

It looks like you took out the first line of assignments. Isn't
kdr7_masks supposed to contain HB_NUM + 1 elements? Not to mention
that reordering the lines has mixed up the comments.

> +void arch_update_kernel_hw_breakpoints(void *unused)
> +{
> + struct hw_breakpoint *bp;
> + unsigned long dr7;
> + int i;
> +
> + /* Check if there is nothing to update */
> + if (hbp_kernel_pos == HB_NUM)
> + return;

This is wrong; you have to set DR7 to 0 first. Might as well leave
out this test and just go through the whole routine. The "for" loop
will be skipped entirely if hbp_kernel_pos == HB_NUM, so you don't
save much by returning early.

Ah, now I see this is why you removed a line from kdr7_masks.

> +
> + get_debugreg(dr7, 7);
> +
> + /* Don't allow debug exceptions while we update the registers */
> + set_debugreg(0UL, 7);
> +
> + /* Clear all kernel-space bits in kdr7 and dr7 before we set them */
> + kdr7 &= ~kdr7_masks[hbp_kernel_pos];
> + dr7 &= ~kdr7_masks[hbp_kernel_pos];
> +
> + for (i = hbp_kernel_pos; i < HB_NUM; i++) {
> + bp = hbp_kernel[i];
> + if (bp) {
> + kdr7 |= encode_dr7(i, bp->info.len, bp->info.type);
> + set_debugreg(hbp_kernel[i]->info.address, i);
> + }
> + }
> +
> + dr7 |= kdr7;
> +
> + /* No need to set DR6 */
> + set_debugreg(dr7, 7);
> +}

> +int __kprobes hw_breakpoint_handler(struct die_args *args)
> +{
> + int i, rc = NOTIFY_STOP;
> + struct hw_breakpoint *bp;
> + /* The DR6 value is stored in args->err */
> + unsigned long dr7, dr6 = args->err;
> +
> + /*
> + * We are here because BT, BS or BD bit is set and no trap bits are set
> + * in dr6 register. Do an early return
> + */

I don't like such comments. It says that BT, BS, or BD is set, but
they don't necessarily have to be. You should say:

/* Do an early return if no trap bits are set in DR6 */

> + if ((dr6 & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)) == 0)
> + return NOTIFY_DONE;
> +
> + get_debugreg(dr7, 7);
> +
> + /* Disable breakpoints during exception handling */
> + set_debugreg(0UL, 7);
> + /*
> + * Assert that local interrupts are disabled
> + * Reset the DRn bits in the virtualized register value.
> + * The ptrace trigger routine will add in whatever is needed.
> + */
> + current->thread.debugreg6 &= ~(DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3);
> +
> + /* Lazy debug register switching */
> + if (per_cpu(last_debugged_task, get_cpu()) != current) {
> + switch_to_none_hw_breakpoint();
> + put_cpu_no_resched();
> + }

Once again, the get_debugreg and set_debugreg statements above should
go here instead.

> +
> + /* Handle all the breakpoints that were triggered */
> + for (i = 0; i < HB_NUM; ++i) {
> + if (likely(!(dr6 & (DR_TRAP0 << i))))
> + continue;
> + /*
> + * Find the corresponding hw_breakpoint structure and
> + * invoke its triggered callback.
> + */
> + if (i >= hbp_kernel_pos)
> + bp = hbp_kernel[i];
> + else {
> + bp = current->thread.hbp[i];
> + if (!bp) {
> + /* False alarm due to lazy DR switching */
> + continue;
> + }
> + }
> + (bp->triggered)(bp, args->regs);
> + /*
> + * We have more work to do (send signals) if the breakpoint is
> + * triggered due to user-space address, or if exceptions other
> + * than breakpoint (such as single-stepping) are triggered.
> + */
> + if (arch_check_va_in_userspace(bp->info.address, bp->info.len))
> + rc = NOTIFY_DONE;

This comment and test are more complicated than they need to be. Just
put "rc = NOTIFY_DONE" (with a comment) in the "else" clause above.

> + }
> + if (dr6 & ~(DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3))
> + rc = NOTIFY_DONE;
> + set_debugreg(dr7, 7);
> + return rc;
> +}

> dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
> {
> struct task_struct *tsk = current;
> - unsigned long condition;
> + unsigned long dr6;
> int si_code;
>
> - get_debugreg(condition, 6);
> + get_debugreg(dr6, 6);
> + set_debugreg(0, 6); /* DR6 may or may not be cleared by the CPU */
>
> /* Catch kmemcheck conditions first of all! */
> - if (condition & DR_STEP && kmemcheck_trap(regs))
> + if ((dr6 & DR_STEP && kmemcheck_trap(regs)) &&
> + !(dr6 & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)))
> return;

Should this test go before the "set_debugreg(0, 6)" statement?

2009-04-17 03:13:14

by K.Prasad

[permalink] [raw]

Subject: Re: [Patch 00/11] Hardware Breakpoint interfaces

On Thu, Apr 16, 2009 at 05:19:54PM -0400, Alan Stern wrote:
> On Tue, 7 Apr 2009, K.Prasad wrote:
>
> > Hi Alan,
> > I am sending you the patches with the changes mentioned in the
> > Changelog below. Please read the patches in conjunction with my replies
> > sent to your previous comments.
> >
> > Let me know your thoughts on this.
>
> Sorry to take so long on this... lots of other things keep cropping up.
>
> Mostly it looks okay; I noticed only a few problems. Comments inline
> below.
>
> Alan Stern
>

Thanks for reviewing it....hope that the code quickly morphs into a form
that is acceptable to most.

> > --- arch/x86/include/asm/processor.h.orig
> > +++ arch/x86/include/asm/processor.h
> > @@ -429,8 +430,11 @@ struct thread_struct {
> > unsigned long debugreg1;
> > unsigned long debugreg2;
> > unsigned long debugreg3;
> > + unsigned long debugreg[HB_NUM];
>
> You forgot to get rid of debugreg1 - debugreg3!
>
>

After the patch re-arrangement, they are removed through "[Patch
06/11]". This is to enable compilation of the main tree after
application of every patch.

> > +static int validate_settings(struct hw_breakpoint *bp, struct task_struct *tsk)
> > +{
> > + int ret;
> > +
> > + if (!bp)
> > + return -EINVAL;
>
> Is this test really needed?
>

It protects us from any rogue requests which has a NULL bp pointer.
Given that the register_<kernel/user>_hw_breakpoint() interfaces are
exported, I think it keeps the code safe from poorly written modules.

And as I just see, the second parameter 'tsk' to validate_settings is
unused throughout. I will remove it.

> > +
> > + ret = arch_validate_hwbkpt_settings(bp, tsk);
> > + if (ret < 0)
> > + goto err;
> > +
> > + /* Check that the virtual address is in the proper range */
> > + if (tsk) {
> > + if (!arch_check_va_in_userspace(bp->info.address, bp->info.len))
> > + return -EFAULT;
> > + } else {
> > + if (!arch_check_va_in_kernelspace(bp->info.address,
> > + bp->info.len))
> > + return -EFAULT;
> > + }
> > + err:
> > + return ret;
> > +}
>
> At this point, almost nothing in the routine is arch-independent. You
> might as well move everything into the arch-specific code and
> eliminate the validate_settings() routine.
>

Ok. I will directly invoke arch_validate_hwbkpt_settings() [maybe this
should be called arch_validate_hw_breakpoint()] and move the arch_check_
function call inside arch_validate_.

> > +void unregister_kernel_hw_breakpoint(struct hw_breakpoint *bp)
> > +{
> > + int i, j;
> > +
> > + mutex_lock(&hw_breakpoint_mutex);
> > +
> > + /* Find the 'bp' in our list of breakpoints for kernel */
> > + for (i = hbp_kernel_pos; i < HB_NUM; i++)
> > + if (bp == hbp_kernel[i])
> > + break;
> > +
> > + /* Check if we did not find a match for 'bp'. If so return early */
> > + if (i == HB_NUM) {
> > + mutex_unlock(&hw_breakpoint_mutex);
> > + return;
> > + }
> > +
> > + /*
> > + * We'll shift the breakpoints one-level above to compact if
> > + * unregistration creates a hole
> > + */
> > + for (j = i; j > hbp_kernel_pos; j--)
> > + hbp_kernel[j] = hbp_kernel[j-1];
> > +
> > + hbp_kernel[hbp_kernel_pos] = 0;
>
> s/0/NULL/
>

Ok.

> > + on_each_cpu(arch_update_kernel_hw_breakpoints, NULL, 0);
> > + hbp_kernel_pos++;
>
> Don't you want to increment hbp_kernel_pos _before_ calling
> on_each_cpu()? Otherwise you're telling the other CPUs to install an
> empty breakpoint.
>

Incrementing hbp_kernel_pos after arch_update_kernel_hw_breakpoints()
during unregister_kernel_ is of the following significance:

- The 'pos' numbered debug register is reset to 0 through the code
"set_debugreg(hbp_kernel[i]->info.address, i)" where 'i' ranges from
hbp_kernel_pos to HB_NUM. This is necessary during unregister operation.

- The following statements would reset the bits corresponding to
unregistered breakpoint only then.
kdr7 &= ~kdr7_masks[hbp_kernel_pos];
dr7 &= ~kdr7_masks[hbp_kernel_pos];

- Helps keep the arch_update_kernel_hw_breakpoints() code generic enough
to be invoked during register and unregister operations.

> > +/* Unmasked kernel DR7 value */
> > +static unsigned long kdr7;
> > +static const unsigned long kdr7_masks[HB_NUM] = {
> > + 0xffff00ff, /* Same for 3, 2, 1, 0 */
> > + 0xfff000fc, /* Same for 3, 2, 1 */
> > + 0xff0000f0, /* Same for 3, 2 */
> > + 0xf00f00c0 /* LEN3, R/W3, G3, L3 */
> > +};
>
> It looks like you took out the first line of assignments. Isn't
> kdr7_masks supposed to contain HB_NUM + 1 elements? Not to mention
> that reordering the lines has mixed up the comments.
>

The kdr7_masks array is slightly different from your implementation
earlier.

The element zero in the array is 0xffff00ff, which has all bits
corresponding to DRs 0, 1, 2 and 3 'set', and hence the comment.

> > +void arch_update_kernel_hw_breakpoints(void *unused)
> > +{
> > + struct hw_breakpoint *bp;
> > + unsigned long dr7;
> > + int i;
> > +
> > + /* Check if there is nothing to update */
> > + if (hbp_kernel_pos == HB_NUM)
> > + return;
>
> This is wrong; you have to set DR7 to 0 first. Might as well leave
> out this test and just go through the whole routine. The "for" loop
> will be skipped entirely if hbp_kernel_pos == HB_NUM, so you don't
> save much by returning early.
>
> Ah, now I see this is why you removed a line from kdr7_masks.
>

This check is added to help the load_debug_registers() call during
boot-time which is when hbp_kernel_pos = HB_NUM. If we don't return
early, then assignments such as these "kdr7 &= ~kdr7_masks[hbp_kernel_pos];"
will cause array-out-of-bounds. Otherwise we have to circumvent it by
adding a dummy HB_NUM element in kdr7_masks[] array. I would prefer
retaining the "if (hbp_kernel_pos == HB_NUM)" check instead.

> > +
> > + get_debugreg(dr7, 7);
> > +
> > + /* Don't allow debug exceptions while we update the registers */
> > + set_debugreg(0UL, 7);
> > +
> > + /* Clear all kernel-space bits in kdr7 and dr7 before we set them */
> > + kdr7 &= ~kdr7_masks[hbp_kernel_pos];
> > + dr7 &= ~kdr7_masks[hbp_kernel_pos];
> > +
> > + for (i = hbp_kernel_pos; i < HB_NUM; i++) {
> > + bp = hbp_kernel[i];
> > + if (bp) {
> > + kdr7 |= encode_dr7(i, bp->info.len, bp->info.type);
> > + set_debugreg(hbp_kernel[i]->info.address, i);
> > + }
> > + }
> > +
> > + dr7 |= kdr7;
> > +
> > + /* No need to set DR6 */
> > + set_debugreg(dr7, 7);
> > +}
>
>
> > +int __kprobes hw_breakpoint_handler(struct die_args *args)
> > +{
> > + int i, rc = NOTIFY_STOP;
> > + struct hw_breakpoint *bp;
> > + /* The DR6 value is stored in args->err */
> > + unsigned long dr7, dr6 = args->err;
> > +
> > + /*
> > + * We are here because BT, BS or BD bit is set and no trap bits are set
> > + * in dr6 register. Do an early return
> > + */
>
> I don't like such comments. It says that BT, BS, or BD is set, but
> they don't necessarily have to be. You should say:
>
> /* Do an early return if no trap bits are set in DR6 */
>

Since we enter hw_breakpoint_handler() code only if DIE_DEBUG (and not
say DIE_INT3), one or more of these bits would be 'set' when we are in
hw_breakpoint_handler() - BS, BT, BD or Trap bits, and hence the comment.

> > + if ((dr6 & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)) == 0)
> > + return NOTIFY_DONE;
> > +
> > + get_debugreg(dr7, 7);
> > +
> > + /* Disable breakpoints during exception handling */
> > + set_debugreg(0UL, 7);
> > + /*
> > + * Assert that local interrupts are disabled
> > + * Reset the DRn bits in the virtualized register value.
> > + * The ptrace trigger routine will add in whatever is needed.
> > + */
> > + current->thread.debugreg6 &= ~(DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3);
> > +
> > + /* Lazy debug register switching */
> > + if (per_cpu(last_debugged_task, get_cpu()) != current) {
> > + switch_to_none_hw_breakpoint();
> > + put_cpu_no_resched();
> > + }
>
> Once again, the get_debugreg and set_debugreg statements above should
> go here instead.
>

I understood that you don't want to enable *any* breakpoints - even
kernel-space ones when we are in hw_breakpoint_handler() thereby making
it safer. I will change the code in the following iteration of patch.

> > +
> > + /* Handle all the breakpoints that were triggered */
> > + for (i = 0; i < HB_NUM; ++i) {
> > + if (likely(!(dr6 & (DR_TRAP0 << i))))
> > + continue;
> > + /*
> > + * Find the corresponding hw_breakpoint structure and
> > + * invoke its triggered callback.
> > + */
> > + if (i >= hbp_kernel_pos)
> > + bp = hbp_kernel[i];
> > + else {
> > + bp = current->thread.hbp[i];
> > + if (!bp) {
> > + /* False alarm due to lazy DR switching */
> > + continue;
> > + }
> > + }
> > + (bp->triggered)(bp, args->regs);
> > + /*
> > + * We have more work to do (send signals) if the breakpoint is
> > + * triggered due to user-space address, or if exceptions other
> > + * than breakpoint (such as single-stepping) are triggered.
> > + */
> > + if (arch_check_va_in_userspace(bp->info.address, bp->info.len))
> > + rc = NOTIFY_DONE;
>
> This comment and test are more complicated than they need to be. Just
> put "rc = NOTIFY_DONE" (with a comment) in the "else" clause above.
>

Yes, that's much simpler and avoids a double-check. Will do it.

> > + }
> > + if (dr6 & ~(DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3))
> > + rc = NOTIFY_DONE;
> > + set_debugreg(dr7, 7);
> > + return rc;
> > +}
>
>
> > dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
> > {
> > struct task_struct *tsk = current;
> > - unsigned long condition;
> > + unsigned long dr6;
> > int si_code;
> >
> > - get_debugreg(condition, 6);
> > + get_debugreg(dr6, 6);
> > + set_debugreg(0, 6); /* DR6 may or may not be cleared by the CPU */
> >
> > /* Catch kmemcheck conditions first of all! */
> > - if (condition & DR_STEP && kmemcheck_trap(regs))
> > + if ((dr6 & DR_STEP && kmemcheck_trap(regs)) &&
> > + !(dr6 & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)))
> > return;
>
> Should this test go before the "set_debugreg(0, 6)" statement?
>

Either way, it doesn't affect the function "kmemcheck_trap()". I don't
find it reading directly from DR6 anywhere.

On a related node, a patch that modifies dr6 to be passed by reference
to the notifier calls will be included in the next iteration.

Let me know, upon receiving the patch, if you still think that such a
change i.e. resetting bits in dr6 after handling of the corresponding
breakpoint needs to be done.

Thanks,
K.Prasad

2009-04-17 14:37:35

by Alan Stern

[permalink] [raw]

Subject: Re: [Patch 00/11] Hardware Breakpoint interfaces

On Fri, 17 Apr 2009, K.Prasad wrote:

> > > +void unregister_kernel_hw_breakpoint(struct hw_breakpoint *bp)
> > > +{
> > > + int i, j;
> > > +
> > > + mutex_lock(&hw_breakpoint_mutex);
> > > +
> > > + /* Find the 'bp' in our list of breakpoints for kernel */
> > > + for (i = hbp_kernel_pos; i < HB_NUM; i++)
> > > + if (bp == hbp_kernel[i])
> > > + break;
> > > +
> > > + /* Check if we did not find a match for 'bp'. If so return early */
> > > + if (i == HB_NUM) {
> > > + mutex_unlock(&hw_breakpoint_mutex);
> > > + return;
> > > + }
> > > +
> > > + /*
> > > + * We'll shift the breakpoints one-level above to compact if
> > > + * unregistration creates a hole
> > > + */
> > > + for (j = i; j > hbp_kernel_pos; j--)
> > > + hbp_kernel[j] = hbp_kernel[j-1];
> > > +
> > > + hbp_kernel[hbp_kernel_pos] = 0;
> >
> > s/0/NULL/
> >
>
> Ok.
>
> > > + on_each_cpu(arch_update_kernel_hw_breakpoints, NULL, 0);
> > > + hbp_kernel_pos++;
> >
> > Don't you want to increment hbp_kernel_pos _before_ calling
> > on_each_cpu()? Otherwise you're telling the other CPUs to install an
> > empty breakpoint.
> >
>
> Incrementing hbp_kernel_pos after arch_update_kernel_hw_breakpoints()
> during unregister_kernel_ is of the following significance:
>
> - The 'pos' numbered debug register is reset to 0 through the code
> "set_debugreg(hbp_kernel[i]->info.address, i)" where 'i' ranges from
> hbp_kernel_pos to HB_NUM. This is necessary during unregister operation.

It _isn't_ necessary. The contents of that debug register don't matter
at all if it isn't enabled. (And there's nothing special about 0; if
the breakpoint were enabled then you would get a breakpoint interrupt
whenever something tried to access address 0.)

Besides, the "set_debugreg" line of code in
arch_update_kernel_hw_breakpoints doesn't get executed when
hbp_kernel[i] is NULL. So this whole point is moot; the debug register
wouldn't be affected anyway.

> - The following statements would reset the bits corresponding to
> unregistered breakpoint only then.
> kdr7 &= ~kdr7_masks[hbp_kernel_pos];
> dr7 &= ~kdr7_masks[hbp_kernel_pos];

But if hbp_kernel_pos is wrong (i.e., if it hasn't been incremented)
then these statements will set kdr7 and dr7 incorrectly.

> - Helps keep the arch_update_kernel_hw_breakpoints() code generic enough
> to be invoked during register and unregister operations.

This is another moot point. Your code is _wrong_ -- how generic it is
doesn't matter.

> > > +/* Unmasked kernel DR7 value */
> > > +static unsigned long kdr7;
> > > +static const unsigned long kdr7_masks[HB_NUM] = {
> > > + 0xffff00ff, /* Same for 3, 2, 1, 0 */
> > > + 0xfff000fc, /* Same for 3, 2, 1 */
> > > + 0xff0000f0, /* Same for 3, 2 */
> > > + 0xf00f00c0 /* LEN3, R/W3, G3, L3 */
> > > +};
> >
> > It looks like you took out the first line of assignments. Isn't
> > kdr7_masks supposed to contain HB_NUM + 1 elements? Not to mention
> > that reordering the lines has mixed up the comments.
> >
>
> The kdr7_masks array is slightly different from your implementation
> earlier.
>
> The element zero in the array is 0xffff00ff, which has all bits
> corresponding to DRs 0, 1, 2 and 3 'set', and hence the comment.

The comment is "Same for 3, 2, 1, 0". But that is an incomplete
thought -- same as _what_? The answer is: Same as "LEN3, R/W3, G3,
L3". But you have messed up the logical order of the comments by
reversing the lines.

Look at this:

/* LEN3, R/W3, G3, L3 */
/* Same for 3, 2 */
/* Same for 3, 2, 1 */
/* Same for 3, 2, 1, 0 */

That makes sense. For instance, it's clear that the second line means
LEN3, LEN2, R/W3, R/W2, G3, G2, L3, L2. Now look at yours:

/* Same for 3, 2, 1, 0 */
/* Same for 3, 2, 1 */
/* Same for 3, 2 */
/* LEN3, R/W3, G3, L3 */

That doesn't make sense. It's a general fact of human language: If you
reverse the order of a bunch of sentences, you will mess up the
meaning.

> > > +void arch_update_kernel_hw_breakpoints(void *unused)
> > > +{
> > > + struct hw_breakpoint *bp;
> > > + unsigned long dr7;
> > > + int i;
> > > +
> > > + /* Check if there is nothing to update */
> > > + if (hbp_kernel_pos == HB_NUM)
> > > + return;
> >
> > This is wrong; you have to set DR7 to 0 first. Might as well leave
> > out this test and just go through the whole routine. The "for" loop
> > will be skipped entirely if hbp_kernel_pos == HB_NUM, so you don't
> > save much by returning early.
> >
> > Ah, now I see this is why you removed a line from kdr7_masks.
> >
>
> This check is added to help the load_debug_registers() call during
> boot-time which is when hbp_kernel_pos = HB_NUM. If we don't return
> early, then assignments such as these "kdr7 &= ~kdr7_masks[hbp_kernel_pos];"
> will cause array-out-of-bounds. Otherwise we have to circumvent it by
> adding a dummy HB_NUM element in kdr7_masks[] array. I would prefer
> retaining the "if (hbp_kernel_pos == HB_NUM)" check instead.

It's up to you. But consider this:

1. You have replaced a single array element (4 bytes of data)
with a conditional test and jump (probably around 7 bytes
of code, not to mention that the conditional jump will slow
down the CPU even when it isn't taken).

2. If hbp_kernel_pos is equal to HB_NUM then the early return
causes you to skip the line setting kdr7. It will end up
having an incorrect value.

> > > +int __kprobes hw_breakpoint_handler(struct die_args *args)
> > > +{
> > > + int i, rc = NOTIFY_STOP;
> > > + struct hw_breakpoint *bp;
> > > + /* The DR6 value is stored in args->err */
> > > + unsigned long dr7, dr6 = args->err;
> > > +
> > > + /*
> > > + * We are here because BT, BS or BD bit is set and no trap bits are set
> > > + * in dr6 register. Do an early return
> > > + */
> >
> > I don't like such comments. It says that BT, BS, or BD is set, but
> > they don't necessarily have to be. You should say:
> >
> > /* Do an early return if no trap bits are set in DR6 */
> >
>
> Since we enter hw_breakpoint_handler() code only if DIE_DEBUG (and not
> say DIE_INT3), one or more of these bits would be 'set' when we are in
> hw_breakpoint_handler() - BS, BT, BD or Trap bits, and hence the comment.

But the comment is _wrong_. We might be here because one of the trap
bits is set and BT, BS, and BD are clear, whereas the comment says that
can't be the case.

Just read the comment again. It says: "We are here because ... no trap
bits are set".

> > > dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
> > > {
> > > struct task_struct *tsk = current;
> > > - unsigned long condition;
> > > + unsigned long dr6;
> > > int si_code;
> > >
> > > - get_debugreg(condition, 6);
> > > + get_debugreg(dr6, 6);
> > > + set_debugreg(0, 6); /* DR6 may or may not be cleared by the CPU */
> > >
> > > /* Catch kmemcheck conditions first of all! */
> > > - if (condition & DR_STEP && kmemcheck_trap(regs))
> > > + if ((dr6 & DR_STEP && kmemcheck_trap(regs)) &&
> > > + !(dr6 & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)))
> > > return;
> >
> > Should this test go before the "set_debugreg(0, 6)" statement?
> >
>
> Either way, it doesn't affect the function "kmemcheck_trap()". I don't
> find it reading directly from DR6 anywhere.

Sure, it doesn't affect kmemcheck_trap(). But it might affect
something else. What if some other code later on looks at the contents
of DR6?

Incidentally, the parenthese in that test are wrong. It should be:

if ((dr6 & DR_STEP) && kmemcheck_trap(regs) &&
!(dr6 & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)))

> On a related node, a patch that modifies dr6 to be passed by reference
> to the notifier calls will be included in the next iteration.
>
> Let me know, upon receiving the patch, if you still think that such a
> change i.e. resetting bits in dr6 after handling of the corresponding
> breakpoint needs to be done.

At the moment, the only other bits which could be set are BT and BD.
Presumably we don't care about BD. Do you care if BT gets wiped out
before anybody has a chance to see it?

Alan Stern

2009-04-24 05:57:17

by K.Prasad

[permalink] [raw]

Subject: Re: [Patch 00/11] Hardware Breakpoint interfaces

On Fri, Apr 17, 2009 at 10:37:13AM -0400, Alan Stern wrote:
> On Fri, 17 Apr 2009, K.Prasad wrote:
>

Hi Alan,
I have modified the patch by accepting all of your suggestions
excepting one - which I try to explain below.

I will send the patchset containing changes suggested by you and a few
others (have explained that in the changelog).

Kindly review the new code.

> >
> > > > + on_each_cpu(arch_update_kernel_hw_breakpoints, NULL, 0);
> > > > + hbp_kernel_pos++;
> > >
> > > Don't you want to increment hbp_kernel_pos _before_ calling
> > > on_each_cpu()? Otherwise you're telling the other CPUs to install an
> > > empty breakpoint.
> > >
> >
> > Incrementing hbp_kernel_pos after arch_update_kernel_hw_breakpoints()
> > during unregister_kernel_ is of the following significance:
> >
> > - The 'pos' numbered debug register is reset to 0 through the code
> > "set_debugreg(hbp_kernel[i]->info.address, i)" where 'i' ranges from
> > hbp_kernel_pos to HB_NUM. This is necessary during unregister operation.
>
> It _isn't_ necessary. The contents of that debug register don't matter
> at all if it isn't enabled. (And there's nothing special about 0; if
> the breakpoint were enabled then you would get a breakpoint interrupt
> whenever something tried to access address 0.)
>
> Besides, the "set_debugreg" line of code in
> arch_update_kernel_hw_breakpoints doesn't get executed when
> hbp_kernel[i] is NULL. So this whole point is moot; the debug register
> wouldn't be affected anyway.
>
> > - The following statements would reset the bits corresponding to
> > unregistered breakpoint only then.
> > kdr7 &= ~kdr7_masks[hbp_kernel_pos];
> > dr7 &= ~kdr7_masks[hbp_kernel_pos];
>
> But if hbp_kernel_pos is wrong (i.e., if it hasn't been incremented)
> then these statements will set kdr7 and dr7 incorrectly.
>
> > - Helps keep the arch_update_kernel_hw_breakpoints() code generic enough
> > to be invoked during register and unregister operations.
>
> This is another moot point. Your code is _wrong_ -- how generic it is
> doesn't matter.
>
>

The arch_update_kernel_hw_breakpoints() was designed to work like this -
it updates all registers beginning 'hbp_kernel_pos' to (HB_NUM - 1) with
the values stored in hbp_kernel[] array.

When inserting a new breakpoint, hbp_kernel_pos is decremented *before*
invoking arch_update_kernel_hw_breakpoints() so that the new value is
also written onto the physical debug register.

On removal, 'hbp_kernel_pos' is incremented *after*
arch_update_kernel_hw_breakpoints() so that the physical debug registers
i.e. both DR7 and DR<pos> are updated with the changes post removal and
compaction. I'm ready to make changes but don't see where the code
actually goes wrong. Can you explain that?

Thanks,
K.Prasad

2009-04-24 14:16:19

by Alan Stern

[permalink] [raw]

Subject: Re: [Patch 00/11] Hardware Breakpoint interfaces

On Fri, 24 Apr 2009, K.Prasad wrote:

> The arch_update_kernel_hw_breakpoints() was designed to work like this -
> it updates all registers beginning 'hbp_kernel_pos' to (HB_NUM - 1) with
> the values stored in hbp_kernel[] array.
>
> When inserting a new breakpoint, hbp_kernel_pos is decremented *before*
> invoking arch_update_kernel_hw_breakpoints() so that the new value is
> also written onto the physical debug register.
>
> On removal, 'hbp_kernel_pos' is incremented *after*
> arch_update_kernel_hw_breakpoints() so that the physical debug registers
> i.e. both DR7 and DR<pos> are updated with the changes post removal and
> compaction. I'm ready to make changes but don't see where the code
> actually goes wrong. Can you explain that?

I'm sorry; I misread the code in arch_update_kernel_hw_breakpoints().
It isn't actually wrong, and you are correct to increment
hbp_kernel_pos where you do. Your code is different from my original
version, which would update all the debug registers at once instead of
doing the kernel and userspace breakpoints separately -- that's what
confused me.

There is one change you could make to improve the routine, however. In
arch_update_kernel_hw_breakpoints(), the line

kdr7 &= ~kdr7_masks[hbp_kernel_pos];

really should be

kdr7 = 0;

since kdr7 never contains anything other than kernel breakpoint
settings. (You could update the comment in the preceding line as
well.)

Alan Stern

2009-04-24 15:57:33

by K.Prasad

[permalink] [raw]

Subject: Re: [Patch 00/11] Hardware Breakpoint interfaces

On Fri, Apr 24, 2009 at 10:16:07AM -0400, Alan Stern wrote:
> On Fri, 24 Apr 2009, K.Prasad wrote:
>
> > The arch_update_kernel_hw_breakpoints() was designed to work like this -
> > it updates all registers beginning 'hbp_kernel_pos' to (HB_NUM - 1) with
> > the values stored in hbp_kernel[] array.
> >
> > When inserting a new breakpoint, hbp_kernel_pos is decremented *before*
> > invoking arch_update_kernel_hw_breakpoints() so that the new value is
> > also written onto the physical debug register.
> >
> > On removal, 'hbp_kernel_pos' is incremented *after*
> > arch_update_kernel_hw_breakpoints() so that the physical debug registers
> > i.e. both DR7 and DR<pos> are updated with the changes post removal and
> > compaction. I'm ready to make changes but don't see where the code
> > actually goes wrong. Can you explain that?
>
> I'm sorry; I misread the code in arch_update_kernel_hw_breakpoints().
> It isn't actually wrong, and you are correct to increment
> hbp_kernel_pos where you do. Your code is different from my original
> version, which would update all the debug registers at once instead of
> doing the kernel and userspace breakpoints separately -- that's what
> confused me.
>
> There is one change you could make to improve the routine, however. In
> arch_update_kernel_hw_breakpoints(), the line
>
> kdr7 &= ~kdr7_masks[hbp_kernel_pos];
>
> really should be
>
> kdr7 = 0;
>
> since kdr7 never contains anything other than kernel breakpoint
> settings. (You could update the comment in the preceding line as
> well.)
>
> Alan Stern

Sure, I'd make that change (in the subsequent iteration) - it's much simpler.

Do you think that the patchset is now in a form that can be submitted
for upstream (-tip tree) acceptance?

Thanks,
K.Prasad

2009-04-24 16:16:54

by Alan Stern

[permalink] [raw]

Subject: Re: [Patch 00/11] Hardware Breakpoint interfaces

On Fri, 24 Apr 2009, K.Prasad wrote:

> Do you think that the patchset is now in a form that can be submitted
> for upstream (-tip tree) acceptance?

I'll let you know after I've had a chance to go over it thoroughly.

Alan Stern