I'll lead with a question. Are the people making the Perf API changes
booting allyesconfig kernels?
The regression tests built into the kernel for kgdb fail as a result of
the perf API changes and can result in a hard kernel hang. My hope
would have been that someone would have reported the problem, created a
patch to disable the test that hangs the kernel, or to fix kgdb such
that it works with the API changes. Likewise, if there are tests I
should run to regression test the changes to the perf API, I would like
to know since we all want to make use of the same hw resource.
A patch has been included in this mail which allows kgdb to pass the
internal regression tests for hw breakpoints. I would like to get some
comments or start a discussion as to how to solve this problem in the
2.6.33 cycle, as it is a regression in functionality.
I did not address this in the patch, but I found that the
hw_breakpoint.c API is lacking a function to query if there is a
breakpoint slot available on every processor, while atomic. Having that
would allow an end user to see an error generated up to gdb that there
are not enough breakpoints available, as opposed to finding out about it
via a printk message after you resume the system.
Thanks,
Jason.
* Jason Wessel <[email protected]> wrote:
> I'll lead with a question. Are the people making the Perf API changes
> booting allyesconfig kernels?
>
> The regression tests built into the kernel for kgdb fail as a result
> of the perf API changes and can result in a hard kernel hang.
>
> My hope would have been that someone would have reported the problem,
> created a patch to disable the test that hangs the kernel, or to fix
> kgdb such that it works with the API changes. Likewise, if there are
> tests I should run to regression test the changes to the perf API, I
> would like to know since we all want to make use of the same hw
> resource.
>
> A patch has been included in this mail which allows kgdb to pass the
> internal regression tests for hw breakpoints. I would like to get
> some comments or start a discussion as to how to solve this problem in
> the 2.6.33 cycle, as it is a regression in functionality.
Hm, the kgdb hw-breakpoint changes freshly put into v2.6.33 look pretty
broken: they access the raw hw registers and ignore the higher level
abstraction. Your patch still uses way too low level primitives. Please
use the highlevel abstraction: register/unregister_wide_hw_breakpoint()
should do the trick. (if there's any changes/extensions needed to it
then please let us know about it.)
If that's too much for v2.6.33 then i guess we need to revert or disable
the kgdb hw-breakpoint changes for now.
Thanks,
Ingo
* Ingo Molnar <[email protected]> wrote:
> Hm, the kgdb hw-breakpoint changes freshly put into v2.6.33 look
> pretty broken: [...]
Sorry - i thought for a moment that it was introduced by this recent
change:
7f8b7ed: kgdb: Always process the whole breakpoint list on activate or deactivate
but it's deeper than that indeed.
Basically we have two options:
A- change kgdb to use the hw-breakpoints highlevel APIs (i'd prefer
that)
B- or keep what we had so far: kgdb overrides existing GDB (and now
perf) breakpoints
I havent noticed that hw-breakpoints lock up under kgdb.
Ingo
Ingo Molnar wrote:
> * Ingo Molnar <[email protected]> wrote:
>
>
>> Hm, the kgdb hw-breakpoint changes freshly put into v2.6.33 look
>> pretty broken: [...]
>>
>
> Sorry - i thought for a moment that it was introduced by this recent
> change:
>
> 7f8b7ed: kgdb: Always process the whole breakpoint list on activate or deactivate
>
> but it's deeper than that indeed.
>
>
Yes, the broken parts came from the perf merge.
> Basically we have two options:
>
> A- change kgdb to use the hw-breakpoints highlevel APIs (i'd prefer
> that)
>
>
Right now we can't because the high level code uses all sorts of mutexes
and sync points to get the hw breakpoints installs on the various
processors. After I re-spun my RFC patch, I found another problem. I
do use the high level code to create a block of 4 (struct perf_event **)
structures, but doing so ultimately calls the reserve hw breakpoint even
though they are marked as disabled when created.
Should I, or can I change that behavior?
The aim of the RFC patch was to allow kgdb to using the arch_install*
break point functions directly because they can be used atomically, and
the perf code at the high level will still not get a breakpoint if kgdb
is already using it that way, and kgdb won't squash a breakpoint that
the perf code is using.
In the ideal world the management routines could be the same, but the
problem is in the way we want to deal with the break points in kgdb
while the master and all slave cpus are effectively stopped.
> B- or keep what we had so far: kgdb overrides existing GDB (and now
> perf) breakpoints
>
This is broken today, because the perf installs itself as the lowest
priority.
If kgdb were to get a lower priority there is a way to make it work
again using the existing code.
The RFC patch I made aligns things a little better in that at least
perf, user space, and kgdb are all pulling from the same low level
management code. It has the desirable side effect of making user space
hw breakpoints work in conjunction with kernel space hw breakpoints
which is something that never worked before.
> I havent noticed that hw-breakpoints lock up under kgdb.
>
I don't see the lockups all the time, it happens when using the kgdb
test suite from boot with the allyesconfig. Or if you boot the kernel
with kgdbts=V1F100.
I think for now we might just consider turning off the hw breakpoint
code in kgdb so we don't hang kernels, until this is sorted out. Patch
provided.
Jason.
On Sat, Dec 12, 2009 at 03:01:58PM -0600, Jason Wessel wrote:
> Ingo Molnar wrote:
> > Basically we have two options:
> >
> > A- change kgdb to use the hw-breakpoints highlevel APIs (i'd prefer
> > that)
> >
> >
>
> Right now we can't because the high level code uses all sorts of mutexes
> and sync points to get the hw breakpoints installs on the various
> processors. After I re-spun my RFC patch, I found another problem. I
> do use the high level code to create a block of 4 (struct perf_event **)
> structures, but doing so ultimately calls the reserve hw breakpoint even
> though they are marked as disabled when created.
>
> Should I, or can I change that behavior?
We could probably have a helper that allocates a disabled breakpoint
without reserving it. But the problem remains: you'll need to take
locks when you eventually reserve it and when you activate it.
The fact that it can happen from nmi is really a problem.
Is there any possibility that we know the user has started a
kgdb session, and then reserve as much hardware breakpoints
as we can in kgdb at this time?
Thanks.
Frederic Weisbecker wrote:
> On Sat, Dec 12, 2009 at 03:01:58PM -0600, Jason Wessel wrote:
>
>> Ingo Molnar wrote:
>>
>>> Basically we have two options:
>>>
>>> A- change kgdb to use the hw-breakpoints highlevel APIs (i'd prefer
>>> that)
>>>
>>>
>>>
>> Right now we can't because the high level code uses all sorts of mutexes
>> and sync points to get the hw breakpoints installs on the various
>> processors. After I re-spun my RFC patch, I found another problem. I
>> do use the high level code to create a block of 4 (struct perf_event **)
>> structures, but doing so ultimately calls the reserve hw breakpoint even
>> though they are marked as disabled when created.
>>
>> Should I, or can I change that behavior?
>>
>
>
>
> We could probably have a helper that allocates a disabled breakpoint
> without reserving it.
I worked around that restriction for now, in the current version of the
kgdb patches. When kgdb registers with the die notifier in its init
phase, it allocates the perf structures via the perf API and
subsequently disables the breakpoints with the low level API.
> But the problem remains: you'll need to take
> locks when you eventually reserve it and when you activate it.
>
> The fact that it can happen from nmi is really a problem.
>
>
I talked with Jan a bit with respect to this problem. He recommended to
possibly allow kgdb to obtain hw breakpoints locklessly and to break
reservations that exist with the low level API. The current patch in
the kgdb series does not break reservations, it only uses a slot that is
not already in use. Let us call the scenarios A and B.
A) allow kgdb to break existing reservations
B) kgdb can use what is not reserved, without locks
What is missing right now is a notification mechanism and a separate
count for the debugger as to what is in use. I tend to think that B is
the right default approach, but Jan was leaning towards scenario A.
> Is there any possibility that we know the user has started a
> kgdb session, and then reserve as much hardware breakpoints
> as we can in kgdb at this time?
>
>
That is the way I implemented it the first time. Reserve all the slots,
and then nothing else could use them. That didn't work out too well
because then the user space could not make use of hw breakpoints,
granted this never worked before with user space + kernel space sharing
between ptrace and kgdb.
Jason.
On Wed, Dec 30, 2009 at 10:53:11AM -0600, Jason Wessel wrote:
> Frederic Weisbecker wrote:
> > We could probably have a helper that allocates a disabled breakpoint
> > without reserving it.
>
> I worked around that restriction for now, in the current version of the
> kgdb patches. When kgdb registers with the die notifier in its init
> phase, it allocates the perf structures via the perf API and
> subsequently disables the breakpoints with the low level API.
It disables it but then the breakpoint still has a reserved slot,
that looks too much for an opt-in config that may or
may not be used. One slot among four would be irremediably
unavailable from userspace once we set CONFIG_KGDB, if I understand well...
Is it doing so in the beginning of a debugging session or
at boot time?
> > But the problem remains: you'll need to take
> > locks when you eventually reserve it and when you activate it.
> >
> > The fact that it can happen from nmi is really a problem.
> >
> >
>
> I talked with Jan a bit with respect to this problem. He recommended to
> possibly allow kgdb to obtain hw breakpoints locklessly and to break
> reservations that exist with the low level API. The current patch in
> the kgdb series does not break reservations, it only uses a slot that is
> not already in use. Let us call the scenarios A and B.
>
> A) allow kgdb to break existing reservations
> B) kgdb can use what is not reserved, without locks
>
> What is missing right now is a notification mechanism and a separate
> count for the debugger as to what is in use. I tend to think that B is
> the right default approach, but Jan was leaning towards scenario A.
A looks dangerous, in that an overflow of the possible number of
breakpoints would make them fighting for the debug registers.
Only 4 of them will make it :) (in x86)
I guess that in overflow case, the plan is to kick out one of the
running breakpoints.
I see several problems in this preempt scheme:
- which breakpoint should we preempt?
- that will bring some complexities in the current code. You'll need
to keep track of the preempted breakpoint, deactivate, reactivate it
etc... Moreover the activation things require to take some locks.
I don't think A is good idea.
B looks feasible, but only at the cost of using a best effort
try (in case of NMI).
You'll need to check if the current lock that protects the reservation
datas is already taken. If so you have to give up.
That said this should be fine as the breakpoint reservation is a rare
path, and I doubt one would try to set up a breakpoint in kgdb in the same
time perf does, or any other users.
That said, either A or B, you'll need to take locks to activate/deactivate
the breakpoints.
> > Is there any possibility that we know the user has started a
> > kgdb session, and then reserve as much hardware breakpoints
> > as we can in kgdb at this time?
> >
> >
>
> That is the way I implemented it the first time. Reserve all the slots,
> and then nothing else could use them. That didn't work out too well
> because then the user space could not make use of hw breakpoints,
> granted this never worked before with user space + kernel space sharing
> between ptrace and kgdb.
Say one opens a kgdb session. A very cool thing would be to reserve
as much breakpoints as we can (without prempting existing ones)
and release them once the session is closed. This is not a problem that
userspace can't reserve new ones during this session, we are debugging
the kernel at this time. I doubt we need userspace breakpoints at the same
time. Do we?
That said I really lack some kgdb background. In which context the
user is connecting to kgdb? The above would only work in a non-NMI
context.
Frederic Weisbecker wrote:
> On Wed, Dec 30, 2009 at 10:53:11AM -0600, Jason Wessel wrote:
>
>> Frederic Weisbecker wrote:
>>
>>> We could probably have a helper that allocates a disabled breakpoint
>>> without reserving it.
>>>
>> I worked around that restriction for now, in the current version of the
>> kgdb patches. When kgdb registers with the die notifier in its init
>> phase, it allocates the perf structures via the perf API and
>> subsequently disables the breakpoints with the low level API.
>>
>
>
>
> It disables it but then the breakpoint still has a reserved slot,
> that looks too much for an opt-in config that may or
> may not be used. One slot among four would be irremediably
> unavailable from userspace once we set CONFIG_KGDB, if I understand well...
> Is it doing so in the beginning of a debugging session or
> at boot time?
>
>
>
In my latest patch, a single HW breakpoint slot goes away only while
running through the arch specific kgdb init, it is immediately given
back after the kgdb arch init completes. The arch init either occurs at
boot via kernel cmd line, or is requested via "echo >
/sys/module/kgdboc/...".
Ideally, I would like to get this fixed or at least back to the state of
working where kgdb can make use of HW breakpoints in 2.6.33.
To address your questions.
I used the principle that kgdb can have what ever is "left over" in
terms of available breakpoints that are not presently in use by the perf
code. In order to configure kgdb there must be at least one HW
breakpoint available or configuration fails.
We assume there is a person running the kernel debugger and has some
knowledge about what all is going on and will make a decision about how
he or she wants to allocate HW breakpoint resources.
>
>
>>> Is there any possibility that we know the user has started a
>>> kgdb session, and then reserve as much hardware breakpoints
>>> as we can in kgdb at this time?
>>>
>>>
>>>
>> That is the way I implemented it the first time. Reserve all the slots,
>> and then nothing else could use them. That didn't work out too well
>> because then the user space could not make use of hw breakpoints,
>> granted this never worked before with user space + kernel space sharing
>> between ptrace and kgdb.
>>
>
>
>
> Say one opens a kgdb session. A very cool thing would be to reserve
> as much breakpoints as we can (without prempting existing ones)
> and release them once the session is closed. This is not a problem that
> userspace can't reserve new ones during this session, we are debugging
> the kernel at this time. I doubt we need userspace breakpoints at the same
> time. Do we?
>
>
You can end up using user space HW breakpoints if you happen to be
running an application that uses ptrace to set them. The current kgdb
has never worked with user and system breaks.
> That said I really lack some kgdb background. In which context the
> user is connecting to kgdb? The above would only work in a non-NMI
> context.
>
>
As I understand it, this is mainly about how the context switching
occurs. There is some scheduling that sets and unsets the HW
breakpoints which are managed through the perf API. You can have a perf
system level HW break, a user space HW break (gdb debugging for
instance), or kgdb setting a HW break. And as the system changes
contexts, the kgdb breaks would in theory get restored by virtue of use
of the low level API.
Do you have a test case for some typical use of perf + a HW breakpoint?
Perhaps we can narrow down the problem set to the typical cases so as to
move on.
Here is a simple case for instance:
* User sets perf HW breakpoint to get some data (perhaps you have an
example?)
* User makes use of KGDB to set HW breakpoint at do_fork()
Simple case 2:
* User uses a HW breakpoint in a looped app and does a "c 1000000" in gdb
* User uses KGDB to set a HW breakpoint in do_fork()
I figure you iterate a few times through to see that it is working. It
is possible to make an in kernel test case as well, but I figure it
would be best to describe some typical, plausible case first.
Thanks,
Jason.
--
The patch is not inlined here but we are talking about:
http://git.kernel.org/?p=linux/kernel/git/jwessel/linux-2.6-kgdb.git;a=commitdiff;h=15bc6ce269f1d1f46751236bf5f099e051aa65ee