2008-08-28 16:31:44

by Marlow Weston

Subject: HELP! KProbes bug

Hello to everyone listed under KProbes in the current kernel MAINTAINERS
file:

I can't find this bug reported anywhere, nor an obvious place to report
it, so I used that file to find people to write to. If there is
somewhere else I should be sending this, please tell me and I will
redirect it there.

I think I have found a KProbes bug when I turn on the KProbes via a proc
file call instead of via the init code. The stack trace is attached and
seems to indicate a locking issue. Also attached is module code that
will make this happen. Any advice on where to start hunting this down
would be greatly appreciated.

If the probe handlers return quickly, i.e. there is no
schedule_timeout_interrupt(), then the problem doesn't show up. The
problem is exacerbated when the handlers do time-consuming work while
other probes are attempting to register. Also, I don't believe it is
tied to any particular probe, since which probe locks up varies (and my
mad attempts at commenting out various probes did not help).

Note: if you do debug this, do not use any kernel later than 2.6.25,
because a bug introduced somewhere after it causes a hang related to
the timeouts, not necessarily to KProbes.

Thank you,
--Marlow Weston


Attachments:
kretprobe.jpg (178.63 kB)
082808_test_2.tgz (12.51 kB)

2008-08-28 17:05:25

by Masami Hiramatsu

Subject: Re: HELP! KProbes bug

Hi Marlow,

Marlow Weston wrote:
> Hello to everyone listed under KProbes in the current kernel MAINTAINERS
> file:
>
> I can't find this bug reported anywhere, nor an obvious place to report
> it, so I used that file to find people to write to. If there is
> somewhere else I should be sending this, please tell me and I will
> redirect it there.
>
> I think I have found a KProbes bug when I turn on the KProbes via a proc
> file call instead of via the init code. The stack trace is attached and
> seems to indicate a locking issue. Also attached is module code that
> will make this happen. Any advice on where to start hunting this down
> would be greatly appreciated.
>
> If the probe handlers return quickly, i.e. there is no
> schedule_timeout_interrupt(), then the problem doesn't show up. The
> problem is exacerbated when the handlers do time-consuming work while
> other probes are attempting to register. Also, I don't believe it is
> tied to any particular probe, since which probe locks up varies (and my
> mad attempts at commenting out various probes did not help).

Why do you want to call the scheduler from a probe handler?

By design, a kprobe handler MUST NOT call scheduling functions, because
kprobes detects recursive calls by using a per-cpu variable, and the
scheduler might move the probed process to another cpu. In particular,
since kretprobe uses a spinlock (or hashed spinlocks on recent kernels),
calling the scheduler from the handler will cause a deadlock.
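
To illustrate the per-cpu recursion check described above, here is a
small user-space toy model in C. It is only a sketch and not kernel
code: the names probe_active, task_cpu, toy_sleep_and_migrate() and
toy_probe_hit() are hypothetical, and the real kprobes bookkeeping is
more involved. The model assumes that a sleeping handler always resumes
on a different cpu.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical stand-in for the per-cpu "a probe is running" marker. */
bool probe_active[NR_CPUS];

/* The cpu the simulated task is currently running on. */
int task_cpu;

/* Models a sleeping call: the task resumes on another cpu. */
void toy_sleep_and_migrate(void)
{
    task_cpu = (task_cpu + 1) % NR_CPUS;
}

/*
 * Simulates one probe hit on the current cpu.
 * Returns false if the hit is rejected as a recursive call.
 */
bool toy_probe_hit(bool handler_sleeps)
{
    int entry_cpu = task_cpu;

    assert(entry_cpu >= 0 && entry_cpu < NR_CPUS);

    if (probe_active[entry_cpu])
        return false;            /* looks recursive, so it is skipped */

    probe_active[entry_cpu] = true;

    if (handler_sleeps)
        toy_sleep_and_migrate(); /* the handler yields the cpu */

    /* The exit path clears the marker of the cpu we are on *now*. */
    probe_active[task_cpu] = false;
    return true;
}
```

In this model, a hit whose handler sleeps on cpu 0 and resumes on cpu 1
leaves the marker of cpu 0 set, so the next hit on cpu 0 is treated as
recursive and skipped.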

Anyway, if you read Documentation/kprobes.txt carefully, you will find
the following paragraph:

---
5. Kprobes Features and Limitations
[...]
Probe handlers are run with preemption disabled. Depending on the
architecture, handlers may also run with interrupts disabled. In any
case, your handler should not yield the CPU (e.g., by attempting to
acquire a semaphore).
---
(I think this note should prohibit process scheduling more clearly.)

So, it's not a bug in kprobes. It's a known limitation.

Thank you,


--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division
