The hw_breakpoint subsystem consumes all the hardware
breakpoint exceptions since it hooks the notify_die
handlers first, this means that kgdb doesn't get the
opportunity to handle hw breakpoint exceptions generated
by kgdb itself.
This patch adds an extend flag to perf_event_attr for
hw_breakpoint_handler() to decide to pass or stop the
DIE_DEBUG notification.
As KGDB set that flag, hw_breakpoint_handler() will pass
the DIE_DEBUG notification, thus kgdb have the chance
to take DIE_DEBUG notification.
Signed-off-by: Dongdong Deng <[email protected]>
Reviewed-by: Bruce Ashfield <[email protected]>
---
arch/x86/kernel/hw_breakpoint.c | 14 ++++++++++++++
arch/x86/kernel/kgdb.c | 2 ++
include/linux/perf_event.h | 9 +++++++++
3 files changed, 25 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index a8f1b80..b38f786 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -406,6 +406,8 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
* ii) When there are more bits than trap<n> set in DR6 register (such
* as BD, BS or BT) indicating that more than one debug condition is
* met and requires some more action in do_debug().
+ * iii) The source of hw breakpoint event want to handle the event
+ * by itself, currently just KGDB have this notion.
*
* NOTIFY_STOP returned for all other cases
*
@@ -464,6 +466,18 @@ static int __kprobes hw_breakpoint_handler(struct die_args *args)
break;
}
+ if (bp->attr.flag == SKIP_HWBP_EVENT_PERF_FLAG) {
+ /*
+ * when attr.flag is set to SKIP_HWBP_EVENT_PERF_FLAG
+ * it indicates currently hw breakpoint event
+ * source want to handle this event by itself.
+ * thus return NOTIFY_DONE here.
+ */
+ rc = NOTIFY_DONE;
+ rcu_read_unlock();
+ break;
+ }
+
perf_bp_event(bp, args->regs);
rcu_read_unlock();
diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index 4f4af75..bc3321f 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -641,6 +641,8 @@ void kgdb_arch_late(void)
attr.bp_len = HW_BREAKPOINT_LEN_1;
attr.bp_type = HW_BREAKPOINT_W;
attr.disabled = 1;
+ /* set kgdb's special requires flag to perf_event */
+ attr.flag = SKIP_HWBP_EVENT_PERF_FLAG;
for (i = 0; i < 4; i++) {
if (breakinfo[i].pev)
continue;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 5d0266d..fcefb09 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -160,6 +160,11 @@ enum perf_event_read_format {
#define PERF_ATTR_SIZE_VER0 64 /* sizeof first published struct */
+enum perf_event_attr_flag {
+ NOSET_PERF_FLAG = 0,
+ SKIP_HWBP_EVENT_PERF_FLAG = 1,
+};
+
/*
* Hardware event_id to monitor via a performance monitoring event:
*/
@@ -225,6 +230,10 @@ struct perf_event_attr {
__u32 bp_type;
__u64 bp_addr;
__u64 bp_len;
+ /*
+ * Ext flag, currently just kgdb used it.
+ */
+ enum perf_event_attr_flag flag;
};
/*
--
1.6.0.4
On Fri, Jul 23, 2010 at 10:16:01AM +0800, Dongdong Deng wrote:
> The hw_breakpoint subsystem consumes all the hardware
> breakpoint exceptions since it hooks the notify_die
> handlers first, this means that kgdb doesn't get the
> opportunity to handle hw breakpoint exceptions generated
> by kgdb itself.
>
> This patch adds an extend flag to perf_event_attr for
> hw_breakpoint_handler() to decide to pass or stop the
> DIE_DEBUG notification.
>
> As KGDB set that flag, hw_breakpoint_handler() will pass
> the DIE_DEBUG notification, thus kgdb have the chance
> to take DIE_DEBUG notification.
>
> Signed-off-by: Dongdong Deng <[email protected]>
> Reviewed-by: Bruce Ashfield <[email protected]>
> ---
> arch/x86/kernel/hw_breakpoint.c | 14 ++++++++++++++
> arch/x86/kernel/kgdb.c | 2 ++
> include/linux/perf_event.h | 9 +++++++++
> 3 files changed, 25 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
> index a8f1b80..b38f786 100644
> --- a/arch/x86/kernel/hw_breakpoint.c
> +++ b/arch/x86/kernel/hw_breakpoint.c
> @@ -406,6 +406,8 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
> * ii) When there are more bits than trap<n> set in DR6 register (such
> * as BD, BS or BT) indicating that more than one debug condition is
> * met and requires some more action in do_debug().
> + * iii) The source of hw breakpoint event want to handle the event
> + * by itself, currently just KGDB have this notion.
> *
> * NOTIFY_STOP returned for all other cases
> *
> @@ -464,6 +466,18 @@ static int __kprobes hw_breakpoint_handler(struct die_args *args)
> break;
> }
>
> + if (bp->attr.flag == SKIP_HWBP_EVENT_PERF_FLAG) {
> + /*
> + * when attr.flag is set to SKIP_HWBP_EVENT_PERF_FLAG
> + * it indicates currently hw breakpoint event
> + * source want to handle this event by itself.
> + * thus return NOTIFY_DONE here.
> + */
> + rc = NOTIFY_DONE;
> + rcu_read_unlock();
> + break;
> + }
> +
No. We really shouldn't make a user ABI change (adding attr.flag) just
to solve an in-kernel-only problem.
And moreover we probably don't need flags at all. Why not just turning kgdb handler
into a higher priority?
I don't even remember why kgdb has its own handler instead of using the
struct perf_event:overflow_handler. May be that's because of the early breakpoints.
On 07/23/2010 08:04 AM, Frederic Weisbecker wrote:
> On Fri, Jul 23, 2010 at 10:16:01AM +0800, Dongdong Deng wrote:
>
>> The hw_breakpoint subsystem consumes all the hardware
>> breakpoint exceptions since it hooks the notify_die
>> handlers first, this means that kgdb doesn't get the
>> opportunity to handle hw breakpoint exceptions generated
>> by kgdb itself.
>>
>> This patch adds an extend flag to perf_event_attr for
>> hw_breakpoint_handler() to decide to pass or stop the
>> DIE_DEBUG notification.
>>
>> As KGDB set that flag, hw_breakpoint_handler() will pass
>> the DIE_DEBUG notification, thus kgdb have the chance
>> to take DIE_DEBUG notification.
>>
>> Signed-off-by: Dongdong Deng <[email protected]>
>> Reviewed-by: Bruce Ashfield <[email protected]>
>> ---
>> arch/x86/kernel/hw_breakpoint.c | 14 ++++++++++++++
>> arch/x86/kernel/kgdb.c | 2 ++
>> include/linux/perf_event.h | 9 +++++++++
>> 3 files changed, 25 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
>> index a8f1b80..b38f786 100644
>> --- a/arch/x86/kernel/hw_breakpoint.c
>> +++ b/arch/x86/kernel/hw_breakpoint.c
>> @@ -406,6 +406,8 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
>> * ii) When there are more bits than trap<n> set in DR6 register (such
>> * as BD, BS or BT) indicating that more than one debug condition is
>> * met and requires some more action in do_debug().
>> + * iii) The source of hw breakpoint event want to handle the event
>> + * by itself, currently just KGDB have this notion.
>> *
>> * NOTIFY_STOP returned for all other cases
>> *
>> @@ -464,6 +466,18 @@ static int __kprobes hw_breakpoint_handler(struct die_args *args)
>> break;
>> }
>>
>> + if (bp->attr.flag == SKIP_HWBP_EVENT_PERF_FLAG) {
>> + /*
>> + * when attr.flag is set to SKIP_HWBP_EVENT_PERF_FLAG
>> + * it indicates currently hw breakpoint event
>> + * source want to handle this event by itself.
>> + * thus return NOTIFY_DONE here.
>> + */
>> + rc = NOTIFY_DONE;
>> + rcu_read_unlock();
>> + break;
>> + }
>> +
>>
>
>
>
> No. We really shouldn't make a user ABI change (adding attr.flag) just
> to solve an in-kernel-only problem.
>
> And moreover we probably don't need flags at all. Why not just turning kgdb handler
> into a higher priority?
>
> I don't even remember why kgdb has its own handler instead of using the
> struct perf_event:overflow_handler. May be that's because of the early breakpoints.
>
>
The patch may or may not be the right way to solve the problem. It is
worth noting that early breakpoints are handled separately with a direct
writes to the debug registers so this API does not apply.
This patch effectively causes the events to get passed to the normal
handlers.
The source of the original problem (which was merged in 2.6.35) is
commit: 018cbffe6819f6f8db20a0a3acd9bab9bfd667e4 - Merge commit
'v2.6.33' into perf/core
Specifically this line right here:
@@@ -502,6 -486,8 +486,6 @@@ static int __kprobes hw_breakpoint_hand
rcu_read_lock();
bp = per_cpu(bp_per_reg[i], cpu);
- if (bp)
- rc = NOTIFY_DONE;
Because the NOTIFY_DONE is never set, a default value of NOTIFY_STOP is
passed at the end and kgdb never gets to see the break point even that
was never intended for the perf handler in the first place.
It is not as easy of course to just revert this patch because it changed
other logic.
Jason.
On Fri, Jul 23, 2010 at 08:19:54AM -0500, Jason Wessel wrote:
> On 07/23/2010 08:04 AM, Frederic Weisbecker wrote:
> > On Fri, Jul 23, 2010 at 10:16:01AM +0800, Dongdong Deng wrote:
> >
> >> The hw_breakpoint subsystem consumes all the hardware
> >> breakpoint exceptions since it hooks the notify_die
> >> handlers first, this means that kgdb doesn't get the
> >> opportunity to handle hw breakpoint exceptions generated
> >> by kgdb itself.
> >>
> >> This patch adds an extend flag to perf_event_attr for
> >> hw_breakpoint_handler() to decide to pass or stop the
> >> DIE_DEBUG notification.
> >>
> >> As KGDB set that flag, hw_breakpoint_handler() will pass
> >> the DIE_DEBUG notification, thus kgdb have the chance
> >> to take DIE_DEBUG notification.
> >>
> >> Signed-off-by: Dongdong Deng <[email protected]>
> >> Reviewed-by: Bruce Ashfield <[email protected]>
> >> ---
> >> arch/x86/kernel/hw_breakpoint.c | 14 ++++++++++++++
> >> arch/x86/kernel/kgdb.c | 2 ++
> >> include/linux/perf_event.h | 9 +++++++++
> >> 3 files changed, 25 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
> >> index a8f1b80..b38f786 100644
> >> --- a/arch/x86/kernel/hw_breakpoint.c
> >> +++ b/arch/x86/kernel/hw_breakpoint.c
> >> @@ -406,6 +406,8 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
> >> * ii) When there are more bits than trap<n> set in DR6 register (such
> >> * as BD, BS or BT) indicating that more than one debug condition is
> >> * met and requires some more action in do_debug().
> >> + * iii) The source of hw breakpoint event want to handle the event
> >> + * by itself, currently just KGDB have this notion.
> >> *
> >> * NOTIFY_STOP returned for all other cases
> >> *
> >> @@ -464,6 +466,18 @@ static int __kprobes hw_breakpoint_handler(struct die_args *args)
> >> break;
> >> }
> >>
> >> + if (bp->attr.flag == SKIP_HWBP_EVENT_PERF_FLAG) {
> >> + /*
> >> + * when attr.flag is set to SKIP_HWBP_EVENT_PERF_FLAG
> >> + * it indicates currently hw breakpoint event
> >> + * source want to handle this event by itself.
> >> + * thus return NOTIFY_DONE here.
> >> + */
> >> + rc = NOTIFY_DONE;
> >> + rcu_read_unlock();
> >> + break;
> >> + }
> >> +
> >>
> >
> >
> >
> > No. We really shouldn't make a user ABI change (adding attr.flag) just
> > to solve an in-kernel-only problem.
> >
> > And moreover we probably don't need flags at all. Why not just turning kgdb handler
> > into a higher priority?
> >
> > I don't even remember why kgdb has its own handler instead of using the
> > struct perf_event:overflow_handler. May be that's because of the early breakpoints.
> >
> >
>
>
> The patch may or may not be the right way to solve the problem. It is
> worth noting that early breakpoints are handled separately with a direct
> writes to the debug registers so this API does not apply.
But you still need to handle them on the debug exception, right?
> This patch effectively causes the events to get passed to the normal
> handlers.
>
> The source of the original problem (which was merged in 2.6.35) is
> commit: 018cbffe6819f6f8db20a0a3acd9bab9bfd667e4 - Merge commit
> 'v2.6.33' into perf/core
>
> Specifically this line right here:
> @@@ -502,6 -486,8 +486,6 @@@ static int __kprobes hw_breakpoint_hand
> rcu_read_lock();
>
> bp = per_cpu(bp_per_reg[i], cpu);
> - if (bp)
> - rc = NOTIFY_DONE;
>
> Because the NOTIFY_DONE is never set, a default value of NOTIFY_STOP is
> passed at the end and kgdb never gets to see the break point even that
> was never intended for the perf handler in the first place.
>
> It is not as easy of course to just revert this patch because it changed
> other logic.
>
> Jason.
Right.
Actually NOTIFY_DONE is returned when there is more work to do: handling
another exception than breakpoint, or sending a signal. Otherwise yeah,
we return NOTIFY_STOP as we assume there is more work to do.
So the following alternatives appear to me:
- Moving the breakpoint exception handling into the
struct perf_event:overflow_handler. In fact I can't find the breakpoint
handling in kgdb.c
- Have a higher priority in kgdb notifier (which means decreasing the one
of hw_breakpoint.c)
- Always returning NOTIFY_DONE from the breakpoint path.
Is this a regression BTW?
On 07/23/2010 09:07 AM, Frederic Weisbecker wrote:
> On Fri, Jul 23, 2010 at 08:19:54AM -0500, Jason Wessel wrote:
>
>> On 07/23/2010 08:04 AM, Frederic Weisbecker wrote:
>>
>> The patch may or may not be the right way to solve the problem. It is
>> worth noting that early breakpoints are handled separately with a direct
>> writes to the debug registers so this API does not apply.
>>
>
>
>
> But you still need to handle them on the debug exception, right?
>
>
>
Yes, but at that point kgdb is first in line for the notifier so it
works out of the box.
>
>
>> This patch effectively causes the events to get passed to the normal
>> handlers.
>>
>> The source of the original problem (which was merged in 2.6.35) is
>> commit: 018cbffe6819f6f8db20a0a3acd9bab9bfd667e4 - Merge commit
>> 'v2.6.33' into perf/core
>>
>> Specifically this line right here:
>> @@@ -502,6 -486,8 +486,6 @@@ static int __kprobes hw_breakpoint_hand
>> rcu_read_lock();
>>
>> bp = per_cpu(bp_per_reg[i], cpu);
>> - if (bp)
>> - rc = NOTIFY_DONE;
>>
>> Because the NOTIFY_DONE is never set, a default value of NOTIFY_STOP is
>> passed at the end and kgdb never gets to see the break point even that
>> was never intended for the perf handler in the first place.
>>
>> It is not as easy of course to just revert this patch because it changed
>> other logic.
>>
>> Jason.
>>
>
>
>
> Right.
>
> Actually NOTIFY_DONE is returned when there is more work to do: handling
> another exception than breakpoint, or sending a signal. Otherwise yeah,
> we return NOTIFY_STOP as we assume there is more work to do.
>
>
For this specific case the hw_breakpoint handler simply consumed a
breakpoint which was not intended for it.
> So the following alternatives appear to me:
>
> - Moving the breakpoint exception handling into the
> struct perf_event:overflow_handler. In fact I can't find the breakpoint
> handling in kgdb.c
>
>
It is in the generic die notification handler for kgdb (looking at
2.6.35-rc6)
arch/x86/kernel/kgdb.c
516 static int __kgdb_notify(struct die_args *args, unsigned long cmd)
...
551 case DIE_DEBUG:
552 if (atomic_read(&kgdb_cpu_doing_single_step) !=
-1) {
553 if (user_mode(regs))
554 return single_step_cont(regs, args);
555 break;
556 } else if (test_thread_flag(TIF_SINGLESTEP))
557 /* This means a user thread is single
stepping
558 * a system call which should be ignored
559 */
560 return NOTIFY_DONE;
561 /* fall through */
> - Have a higher priority in kgdb notifier (which means decreasing the one
> of hw_breakpoint.c)
>
kgdb had always been last in line in arch/x86/kernel/kgdb.c:
608 static struct notifier_block kgdb_notifier = {
609 .notifier_call = kgdb_notify,
610
611 /*
612 * Lowest-prio notifier priority, we want to be notified
last:
613 */
614 .priority = -INT_MAX,
615 };
> - Always returning NOTIFY_DONE from the breakpoint path.
>
>
Without some further investigation, I am not sure what this will do. We
don't want to make things worse of course. Because kgdb uses the
request hw_breakpoint api to request slot reservation having an
attribute to say don't do anything to this HW breakpoint is certainly
one way to fix it.
> Is this a regression BTW?
>
>
Absolutely this is a regression. No change was made in kgdb related to
this and the kgdb HW breakpoint regression tests (which come with the
kernel) stopped working and bisect to the commit I mentioned.
Jason.
On Fri, Jul 23, 2010 at 10:49:20AM -0500, Jason Wessel wrote:
> On 07/23/2010 09:07 AM, Frederic Weisbecker wrote:
> > On Fri, Jul 23, 2010 at 08:19:54AM -0500, Jason Wessel wrote:
> >
> >> On 07/23/2010 08:04 AM, Frederic Weisbecker wrote:
> >>
> >> The patch may or may not be the right way to solve the problem. It is
> >> worth noting that early breakpoints are handled separately with a direct
> >> writes to the debug registers so this API does not apply.
> >>
> >
> >
> >
> > But you still need to handle them on the debug exception, right?
> >
> >
> >
>
> Yes, but at that point kgdb is first in line for the notifier so it
> works out of the box.
Ok.
> > Right.
> >
> > Actually NOTIFY_DONE is returned when there is more work to do: handling
> > another exception than breakpoint, or sending a signal. Otherwise yeah,
> > we return NOTIFY_STOP as we assume there is more work to do.
> >
> >
>
> For this specific case the hw_breakpoint handler simply consumed a
> breakpoint which was not intended for it.
Ah right.
But that thing is right:
/*
* Reset the 'i'th TRAP bit in dr6 to denote completion of
* exception handling
*/
(*dr6_p) &= ~(DR_TRAP0 << i);
/*
* bp can be NULL due to lazy debug register switching
* or due to concurrent perf counter removing.
*/
if (!bp) {
rcu_read_unlock();
break;
}
We need to prevent from dr7 lazy switches. It means kgdb must first check
its own breakpoints.
> > So the following alternatives appear to me:
> >
> > - Moving the breakpoint exception handling into the
> > struct perf_event:overflow_handler. In fact I can't find the breakpoint
> > handling in kgdb.c
> >
> >
>
> It is in the generic die notification handler for kgdb (looking at
> 2.6.35-rc6)
>
> arch/x86/kernel/kgdb.c
>
> 516 static int __kgdb_notify(struct die_args *args, unsigned long cmd)
> ...
> 551 case DIE_DEBUG:
> 552 if (atomic_read(&kgdb_cpu_doing_single_step) !=
> -1) {
> 553 if (user_mode(regs))
> 554 return single_step_cont(regs, args);
> 555 break;
> 556 } else if (test_thread_flag(TIF_SINGLESTEP))
> 557 /* This means a user thread is single
> stepping
> 558 * a system call which should be ignored
> 559 */
> 560 return NOTIFY_DONE;
> 561 /* fall through */
But I can't find where the breakpoints are handled there.
>
> > - Have a higher priority in kgdb notifier (which means decreasing the one
> > of hw_breakpoint.c)
> >
>
> kgdb had always been last in line in arch/x86/kernel/kgdb.c:
>
> 608 static struct notifier_block kgdb_notifier = {
> 609 .notifier_call = kgdb_notify,
> 610
> 611 /*
> 612 * Lowest-prio notifier priority, we want to be notified
> last:
> 613 */
> 614 .priority = -INT_MAX,
> 615 };
Why? It seems to me a kernel debugger should have the highest priority
over anything.
>
> > - Always returning NOTIFY_DONE from the breakpoint path.
> >
> >
>
> Without some further investigation, I am not sure what this will do.
Nothing, this NOTIFY_STOP is only an optimization. But now I think that
won't solve the problem. We still clear a dr6 trap bit for a debug
exception due to lazy dr7 switches we have to handle.
This is why kgdb should have the highest priority, or use the overflow
callback.
> We
> don't want to make things worse of course. Because kgdb uses the
> request hw_breakpoint api to request slot reservation having an
> attribute to say don't do anything to this HW breakpoint is certainly
> one way to fix it.
>
> > Is this a regression BTW?
> >
> >
>
> Absolutely this is a regression. No change was made in kgdb related to
> this and the kgdb HW breakpoint regression tests (which come with the
> kernel) stopped working and bisect to the commit I mentioned.
Yep, this new breakpoint layer has been a PITA for kgdb :)
Frederic Weisbecker wrote:
> On Fri, Jul 23, 2010 at 10:49:20AM -0500, Jason Wessel wrote:
>> On 07/23/2010 09:07 AM, Frederic Weisbecker wrote:
>>> On Fri, Jul 23, 2010 at 08:19:54AM -0500, Jason Wessel wrote:
>>>
>>>> On 07/23/2010 08:04 AM, Frederic Weisbecker wrote:
>>>>
>>>> The patch may or may not be the right way to solve the problem. It is
>>>> worth noting that early breakpoints are handled separately with a direct
>>>> writes to the debug registers so this API does not apply.
>>>>
>>>
>>>
>>> But you still need to handle them on the debug exception, right?
>>>
>>>
>>>
>> Yes, but at that point kgdb is first in line for the notifier so it
>> works out of the box.
>
>
> Ok.
>
>
>
>>> Right.
>>>
>>> Actually NOTIFY_DONE is returned when there is more work to do: handling
>>> another exception than breakpoint, or sending a signal. Otherwise yeah,
>>> we return NOTIFY_STOP as we assume there is more work to do.
>>>
>>>
>> For this specific case the hw_breakpoint handler simply consumed a
>> breakpoint which was not intended for it.
>
>
>
> Ah right.
>
> But that thing is right:
>
> /*
> * Reset the 'i'th TRAP bit in dr6 to denote completion of
> * exception handling
> */
> (*dr6_p) &= ~(DR_TRAP0 << i);
> /*
> * bp can be NULL due to lazy debug register switching
> * or due to concurrent perf counter removing.
> */
> if (!bp) {
> rcu_read_unlock();
> break;
> }
>
>
> We need to prevent from dr7 lazy switches. It means kgdb must first check
> its own breakpoints.
>
>
>>> So the following alternatives appear to me:
>>>
>>> - Moving the breakpoint exception handling into the
>>> struct perf_event:overflow_handler. In fact I can't find the breakpoint
>>> handling in kgdb.c
>>>
>>>
>> It is in the generic die notification handler for kgdb (looking at
>> 2.6.35-rc6)
>>
>> arch/x86/kernel/kgdb.c
>>
>> 516 static int __kgdb_notify(struct die_args *args, unsigned long cmd)
>> ...
>> 551 case DIE_DEBUG:
>> 552 if (atomic_read(&kgdb_cpu_doing_single_step) !=
>> -1) {
>> 553 if (user_mode(regs))
>> 554 return single_step_cont(regs, args);
>> 555 break;
>> 556 } else if (test_thread_flag(TIF_SINGLESTEP))
>> 557 /* This means a user thread is single
>> stepping
>> 558 * a system call which should be ignored
>> 559 */
>> 560 return NOTIFY_DONE;
>> 561 /* fall through */
>
>
>
> But I can't find where the breakpoints are handled there.
>
>
>
>>> - Have a higher priority in kgdb notifier (which means decreasing the one
>>> of hw_breakpoint.c)
>>>
>> kgdb had always been last in line in arch/x86/kernel/kgdb.c:
>>
>> 608 static struct notifier_block kgdb_notifier = {
>> 609 .notifier_call = kgdb_notify,
>> 610
>> 611 /*
>> 612 * Lowest-prio notifier priority, we want to be notified
>> last:
>> 613 */
>> 614 .priority = -INT_MAX,
>> 615 };
>
>
>
> Why? It seems to me a kernel debugger should have the highest priority
> over anything.
In my option, the reason of kgdb set the lowest-prio for
notifier is:
For letting kgdb to keep simple, there is no codes to check the
breakpoint event was generated by kgdb or not, thus it have to set kgdb
as lowest priority to notifier.
If the breakpoint event is not generated by kgdb, the source of the
breakpoint event will consume that event before passing to kgdb's
routine, so that the breakpoint event of kgdb getting must be generated
by kgdb itself.
>
>
>
>>> - Always returning NOTIFY_DONE from the breakpoint path.
>>>
>>>
>> Without some further investigation, I am not sure what this will do.
>
>
>
> Nothing, this NOTIFY_STOP is only an optimization. But now I think that
> won't solve the problem. We still clear a dr6 trap bit for a debug
> exception due to lazy dr7 switches we have to handle.
>
> This is why kgdb should have the highest priority, or use the overflow
> callback.
OK, I will try to use the overflow callback to let kgdb works with
hw_breakpoints. :-)
Thanks,
Dongdong
>
>
>
>> We
>> don't want to make things worse of course. Because kgdb uses the
>> request hw_breakpoint api to request slot reservation having an
>> attribute to say don't do anything to this HW breakpoint is certainly
>> one way to fix it.
>>
>>> Is this a regression BTW?
>>>
>>>
>> Absolutely this is a regression. No change was made in kgdb related to
>> this and the kgdb HW breakpoint regression tests (which come with the
>> kernel) stopped working and bisect to the commit I mentioned.
>
>
> Yep, this new breakpoint layer has been a PITA for kgdb :)
>
>
On Mon, Jul 26, 2010 at 07:13:23PM +0800, DDD wrote:
> Frederic Weisbecker wrote:
>> Why? It seems to me a kernel debugger should have the highest priority
>> over anything.
>
> In my option, the reason of kgdb set the lowest-prio for
> notifier is:
>
> For letting kgdb to keep simple, there is no codes to check the
> breakpoint event was generated by kgdb or not, thus it have to set kgdb
> as lowest priority to notifier.
>
> If the breakpoint event is not generated by kgdb, the source of the
> breakpoint event will consume that event before passing to kgdb's
> routine, so that the breakpoint event of kgdb getting must be generated
> by kgdb itself.
Ok, but that makes it hard to differentiate from a spurious breakpoint
event.
>>
>>
>>
>>>> - Always returning NOTIFY_DONE from the breakpoint path.
>>>>
>>>>
>>> Without some further investigation, I am not sure what this will do.
>>
>>
>>
>> Nothing, this NOTIFY_STOP is only an optimization. But now I think that
>> won't solve the problem. We still clear a dr6 trap bit for a debug
>> exception due to lazy dr7 switches we have to handle.
>>
>> This is why kgdb should have the highest priority, or use the overflow
>> callback.
>
>
> OK, I will try to use the overflow callback to let kgdb works with
> hw_breakpoints. :-)
>
> Thanks,
> Dongdong
Thanks.
kgdb sets breakpoints through arch_install_hw_breakpoint()
So when it triggers, it will be handled by perf through
perf_bp_event(), so it's quite natural it is considered as
handled and then it's bit removed from dr6. The only way
for kgdb to handle it properly is to set an overflow_handler.
Tell me if you encounter any problem.
On 07/28/2010 12:08 PM, Frederic Weisbecker wrote:
> On Mon, Jul 26, 2010 at 07:13:23PM +0800, DDD wrote:
>
>> Frederic Weisbecker wrote:
>>
>>> Why? It seems to me a kernel debugger should have the highest priority
>>> over anything.
>>>
>> In my option, the reason of kgdb set the lowest-prio for
>> notifier is:
>>
>> For letting kgdb to keep simple, there is no codes to check the
>> breakpoint event was generated by kgdb or not, thus it have to set kgdb
>> as lowest priority to notifier.
>>
>> If the breakpoint event is not generated by kgdb, the source of the
>> breakpoint event will consume that event before passing to kgdb's
>> routine, so that the breakpoint event of kgdb getting must be generated
>> by kgdb itself.
>>
>
>
>
> Ok, but that makes it hard to differentiate from a spurious breakpoint
> event.
>
>
>
>
The original thinking was that if you are using a low level debugger
that you would want to stop on such a event or breakpoint because there
is nothing else handling it and your system is about to print an oops
message.
Jason.