2002-10-21 12:24:02

by Richard J Moore

Subject: 2.4 Ready list - Kernel Hooks


Kernel Hooks is also ready,
see: http://www-124.ibm.com/linux/projects/kernelhooks/


Richard
RAS Project Lead - IBM Linux Technology Centre



2002-10-22 20:57:13

by Werner Almesberger

Subject: Re: 2.4 Ready list - Kernel Hooks

Richard J Moore wrote:
> Kernel Hooks is also ready,

I'm a bit puzzled as to what those hooks accomplish. They look
like a less flexible but a little faster and more portable
variant of kprobes.

Is this what they are ? If yes, does it really make sense to
have two so similar mechanisms for tapping into execution flows
in the kernel ?

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina [email protected] /
/_http://www.almesberger.net/____________________________________________/

2002-10-22 23:10:03

by Richard J Moore

Subject: Re: 2.4 Ready list - Kernel Hooks


Werner Almesberger wrote:

>Richard J Moore wrote:
>> Kernel Hooks is also ready,
>
>I'm a bit puzzled as to what those hooks accomplish. They look
>like a less flexible but a little faster and more portable
>variant of kprobes.
>
>Is this what they are ? If yes, does it really make sense to
>have two so similar mechanisms for tapping into execution flows
>in the kernel ?
>
>- Werner

Hello Werner, the two things are different:

kprobes
-------

This is a kernel interface that allows kernel modules to register
one or more probes.
A probe comprises a breakpoint location, a breakpoint handler and a post
single-step handler.
Why use the term probes? Because we don't intend to hijack the system,
merely register a location where we can seamlessly gather data and
continue.
The sequence of events that occurs when code containing a probepoint
executes is:
1) The associated probe handler is invoked.
2) The probe handler returns.
3) The probed instruction is single-stepped.
4) The post-single-step handler is called.
5) The post-single-step handler returns.
6) The probed code continues execution.
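
The six steps above can be walked through in a small user-space simulation. This is only a sketch: `sim_regs`, `sim_probe`, `sim_single_step` and `sim_probe_hit` are hypothetical names invented for illustration, not the kprobes interface, and the "single-step" is faked as an increment rather than a real trap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for saved CPU state (not the kernel's pt_regs). */
struct sim_regs {
    unsigned long ip;   /* instruction pointer */
    unsigned long acc;  /* one made-up data register */
};

struct sim_probe;
typedef void (*sim_handler_t)(struct sim_probe *p, struct sim_regs *regs);

/* A probe: a breakpoint location plus the two handlers described above. */
struct sim_probe {
    unsigned long addr;          /* breakpoint location */
    sim_handler_t handler;       /* called when the breakpoint fires */
    sim_handler_t post_handler;  /* called after the single-step */
    unsigned long hits;          /* how often the probepoint fired */
};

/* Pretend the probed instruction is "increment acc", one byte long. */
static void sim_single_step(struct sim_regs *regs)
{
    regs->acc += 1;
    regs->ip += 1;
}

/* One hit of the probepoint, following steps 1-6 in order. */
static void sim_probe_hit(struct sim_probe *p, struct sim_regs *regs)
{
    assert(p != NULL && regs != NULL);
    assert(regs->ip == p->addr);   /* we only fire at the probed location */
    p->hits++;
    if (p->handler)
        p->handler(p, regs);       /* steps 1-2: probe handler runs, returns */
    sim_single_step(regs);         /* step 3: probed instruction executes */
    if (p->post_handler)
        p->post_handler(p, regs);  /* steps 4-5: post handler runs, returns */
    /* step 6: the caller resumes the probed code at regs->ip */
}

/* Example handlers that only gather data and continue. */
static void sim_log_before(struct sim_probe *p, struct sim_regs *regs)
{
    printf("probe at %#lx: acc=%lu before step\n", p->addr, regs->acc);
}

static void sim_log_after(struct sim_probe *p, struct sim_regs *regs)
{
    printf("probe at %#lx: acc=%lu after step\n", p->addr, regs->acc);
}
```

Calling `sim_probe_hit()` with a `sim_probe` whose `addr` equals `regs->ip` runs both handlers around the simulated instruction and leaves the register state exactly as the instruction alone would have left it, which is the "gather data and continue" property the message stresses.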

kprobes confines itself to kernel-space probepoints, which are implemented
using a breakpoint instruction.
There are three incremental patches that Vamsi submitted today which extend
kprobes as follows:
1) debug register management - provides a kernel interface for debug
register allocation and deallocation so debuggers can co-exist, e.g.
kprobes, ptrace, kdb etc.
2) kwatch points - allows probes to be set using debug registers. This
allows probes to fire on data accesses for example.
3) user space probes - this extends kprobes to be able to set probepoints
in user space. Note probes are tracked by inode and offset so that they are
global and relative to a module. This distinguishes kprobes user-space
probes from ptrace implemented breakpoints.


dprobes
-------
The four kprobes patches are almost equivalent to dprobes. Dprobes provides
a generic RPN interpreter in which to define probe handler actions. We
decided that the RPN interpreter should be separated out from the breakpointing
mechanism. It's just an example probe handler and can exist outside the
kernel. Also having a set of callable interfaces is more flexible than just
having an RPN language to define probe handlers. The dprobes project has
evolved into kprobes + a sample device driver that provides the generic RPN
probe handler.

kernel hooks
------------
This is nothing more than a call-back mechanism such as could be used by
LSM or LTT. The call-backs have to be statically coded into the source
unlike kprobes where the call-back to a probe handler is implemented via a
debug interrupt from a watchpoint or dynamically implanted INT3. We created
kernel hooks for exactly the same reasons that LSM needs hooks - to allow
ancillary function to exist outside the kernel, to avoid kernel bloat, to
allow more than one function to be called from a given call-back (think of
kdb and kprobes - both need to be called from do_debug).

Yes, both kprobes and kernel hooks implement call-backs, but using INT3 to
call functions is not the most efficient call mechanism, whereas implanting
a call-back dynamically for debugging purposes is a tad more difficult if
done by patching in a jmp or call instruction.

It's a case of horses for courses. kprobes is a debugging facility; kernel
hooks is a static call-back mechanism.
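
To make the "static call-back" idea concrete, here is a minimal user-space sketch of one statically coded hook point that several independent clients can attach to. All names (`sim_hook`, `sim_hook_register`, `sim_hook_run`, `sim_do_debug`, the two client exits and the event numbers) are assumptions made up for this illustration; the real kernel hooks interface is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_MAX_EXITS 4

/* A hook exit returns non-zero if it claimed the event. */
typedef int (*sim_exit_fn)(int event, void *data);

/* One hook point with a small fixed table of registered exits. */
struct sim_hook {
    sim_exit_fn exits[SIM_MAX_EXITS];
    void *data[SIM_MAX_EXITS];
    size_t nr_exits;
};

/* Attach another client to the hook; returns -1 when the table is full. */
static int sim_hook_register(struct sim_hook *h, sim_exit_fn fn, void *data)
{
    assert(h != NULL && fn != NULL);
    if (h->nr_exits == SIM_MAX_EXITS)
        return -1;
    h->exits[h->nr_exits] = fn;
    h->data[h->nr_exits] = data;
    h->nr_exits++;
    return 0;
}

/* Run every registered exit; return how many of them claimed the event. */
static int sim_hook_run(const struct sim_hook *h, int event)
{
    int claimed = 0;
    for (size_t i = 0; i < h->nr_exits; i++)
        claimed += h->exits[i](event, h->data[i]) ? 1 : 0;
    return claimed;
}

/* Two hypothetical clients sharing the same hook point. */
static int sim_kdb_exit(int event, void *data)
{
    ++*(int *)data;        /* count calls for demonstration */
    return event == 1;     /* made-up rule: this client claims event 1 */
}

static int sim_kprobes_exit(int event, void *data)
{
    ++*(int *)data;
    return event == 2;     /* made-up rule: this client claims event 2 */
}

/* The hook point itself is coded into the source ahead of time. */
static struct sim_hook sim_do_debug_hook;

static int sim_do_debug(int event)
{
    /* ... the handler's own work would happen here ... */
    return sim_hook_run(&sim_do_debug_hook, event);
}
```

With nothing registered, `sim_do_debug()` only walks an empty table. Once both clients are registered, every event reaches both of them, which is the "more than one function from a given call-back" case mentioned above.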

Richard

2002-10-22 23:49:18

by Greg KH

Subject: Re: 2.4 Ready list - Kernel Hooks

On Wed, Oct 23, 2002 at 12:09:38AM +0100, Richard J Moore wrote:
> We created
> kernel hooks for exactly the same reasons that LSM needs hooks - to allow
> ancillary function to exist outside the kernel, to avoid kernel bloat, to
> allow more than one function to be called from a given call-back (think of
> kdb and kprobes - both need to be called from do_debug).

No, that is NOT the same reason LSM needs hooks! LSM hooks are there to
mediate access to various kernel objects, from within the kernel itself.
Please do not confuse LSM with any of the above projects.

thanks,

greg k-h

2002-10-23 08:10:35

by Richard J Moore

Subject: Re: 2.4 Ready list - Kernel Hooks


>On Wed, Oct 23, 2002 at 12:09:38AM +0100, Richard J Moore wrote:
>> We created
>> kernel hooks for exactly the same reasons that LSM needs hooks - to allow
>> ancillary function to exist outside the kernel, to avoid kernel bloat, to
>> allow more than one function to be called from a given call-back (think of
>> kdb and kprobes - both need to be called from do_debug).
>
>No, that is NOT the same reason LSM needs hooks! LSM hooks are there to
>mediate access to various kernel objects, from within the kernel itself.
>Please do not confuse LSM with any of the above projects.
>
>thanks,
>
>greg k-h

I would have to understand what you meant by "mediate between various
kernel objects" to know whether LSM's need for hooks is radically different
to RAS needs. Can you explain further?


Richard

2002-10-23 15:22:46

by Werner Almesberger

Subject: Re: 2.4 Ready list - Kernel Hooks

Richard J Moore wrote:
> This is nothing more than a call-back mechanism such as could be used by
> LSM or LTT.

Hmm, Greg has already voiced some violent disagreement regarding
LSM :-) That leaves LTT. Given the more exploratory nature of LTT,
I wonder if [dk]probes wouldn't be quite sufficient there, too.

Is the idea that people would deploy hooks locally, i.e. while
profiling or debugging, or that some hooks would be put permanently
in the kernel ? I can envision some rather nasty coding habits
developing if the latter would be used extensively. (INTERCAL has
"COME FROM", COBOL has "ALTER", ... ;-)

By the way, those hooks look like an excellent mechanism for
circumventing the GPL, so you might want to export them with
EXPORT_SYMBOL_GPL.

> Yes both kprobes and kernel hooks implement call-backs, but using INT3 to
> call functions is not the most efficient call mechanism,

Oh, you could probably have some "fast" probes by just checking
for a certain "anchor" pattern (e.g. a sequence of 5 nops on
i386), which could then be replaced with a direct call. This
optimization would have to be optional, in case some code yields
the anchor pattern such that it isn't also a basic block.

Hooks would still have the advantage of easier access to local
variables, of course.

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina [email protected] /
/_http://www.almesberger.net/____________________________________________/

2002-10-23 16:10:46

by Karim Yaghmour

Subject: Re: 2.4 Ready list - Kernel Hooks


Werner Almesberger wrote:
> Richard J Moore wrote:
> > This is nothing more than a call-back mechanism such as could be used by
> > LSM or LTT.
>
> Hmm, Greg has already voiced some violent disagreement regarding
> LSM :-) That leaves LTT. Given the more exploratory nature of LTT,
> I wonder if [dk]probes wouldn't be quite sufficient there, too.

The whole point of tracing is that the system's behavior should not
be modified but only recorded. Generating int3 won't do.

> Oh, you could probably have some "fast" probes by just checking
> for a certain "anchor" pattern (e.g. a sequence of 5 nops on
> i386), which could then be replaced with a direct call. This
> optimization would have to be optional, in case some code yields
> the anchor pattern such that it isn't also a basic block.

If I remember correctly, the optimized arch-dependent code in kernel
hooks uses "compare immediate" and the value of the immediate is
edited to enable/disable hooking. Given modern branch-prediction the
cost should be quite close to an unconditional jump.
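
The dormant-hook cost described here can be illustrated in portable C. This is a sketch under stated assumptions: a plain `volatile int` stands in for the immediate operand that the arch-specific code is said to patch, and `sim_hook_armed`, `sim_hook_arm`, `sim_hook_exit` and `sim_fast_path` are hypothetical names, not part of the kernel hooks patch.

```c
#include <assert.h>

/* Stand-in for the patched immediate: 0 = dormant, 1 = armed. */
static volatile int sim_hook_armed;

/* Counts how often the hook exit actually ran. */
static unsigned long sim_hook_calls;

/* Hypothetical hook exit; a real one would record or inspect the value. */
static void sim_hook_exit(int value)
{
    (void)value;
    sim_hook_calls++;
}

/* Enable or disable the hook, the analogue of editing the immediate. */
static void sim_hook_arm(int on)
{
    assert(on == 0 || on == 1);
    sim_hook_armed = on;
}

/* A frequently executed path with one statically placed hook. */
static int sim_fast_path(int x)
{
    /* Dormant cost: one compare and a branch that is normally not taken. */
    if (sim_hook_armed)
        sim_hook_exit(x);
    return x * 2;   /* the path's real work is unaffected either way */
}
```

The result of `sim_fast_path()` is the same whether or not the hook is armed; only the side call differs, which matches the requirement that tracing records behavior without modifying it.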

Karim

===================================================
Karim Yaghmour
[email protected]
Embedded and Real-Time Linux Expert
===================================================

2002-10-23 16:47:16

by Richard J Moore

Subject: Re: 2.4 Ready list - Kernel Hooks


> Is the idea that people would deploy hooks locally, i.e. while
> profiling or debugging, or that some hooks would be put permanently
> in the kernel ?

Our principal reasons for using hooks are:

1) We simplify the integration of related facilities that would share a
number of common hook points, e.g. kdb, dprobes, ltt etc
2) We don't bloat the kernel with these features but still have the ability
to turn them on dynamically when the need (or the pain) is sufficient for
us to do something about it.
2a) we can reduce the overhead of the extra function when dormant to almost
nil if it can be unhooked from the kernel.
3) We used them during development to extricate a function from the kernel
into a loadable module. This avoided many reboots and kernel builds.


>By the way, those hooks look like an excellent mechanism for
>circumventing the GPL, so you might want to export them with
>EXPORT_SYMBOL_GPL.

We already do that.

I don't envisage having an arbitrary set of hook points scattered
throughout the kernel. It's only when, for example, dprobes needed certain
hooks that we added them.



Richard

2002-10-23 17:05:16

by Greg KH

Subject: Re: 2.4 Ready list - Kernel Hooks

On Wed, Oct 23, 2002 at 09:10:13AM +0100, Richard J Moore wrote:
>
> >On Wed, Oct 23, 2002 at 12:09:38AM +0100, Richard J Moore wrote:
> >> We created
> >> kernel hooks for exactly the same reasons that LSM needs hooks - to allow
> >> ancillary function to exist outside the kernel, to avoid kernel bloat, to
> >> allow more than one function to be called from a given call-back (think of
> >> kdb and kprobes - both need to be called from do_debug).
> >
> >No, that is NOT the same reason LSM needs hooks! LSM hooks are there to
> >mediate access to various kernel objects, from within the kernel itself.
> >Please do not confuse LSM with any of the above projects.
>
> I would have to understand what you meant by "mediate between various
> kernel objects" to know whether LSM's need for hooks is radically different
> to RAS needs. Can you explain further?

Please read the LSM documentation for more information about this. It
can be found in the kernel at:
Documentation/DocBook/lsm.*
and there are a number of USENIX and OLS papers about different aspects
of the project at:
lsm.immunix.org

In the beginning of the LSM project, both the DProbes and LTT groups
came asking that we use their patches to implement LSM. It was
quickly determined that the types of "hooks" these projects offered was
not what the LSM group needed (or wanted). So the current
implementation was developed.

Hope this helps. If you have any further questions, please feel free to
ask (after reading that documentation :)

thanks,

greg k-h

2002-10-23 19:44:56

by Werner Almesberger

Subject: Re: 2.4 Ready list - Kernel Hooks

Richard J Moore wrote:
>>EXPORT_SYMBOL_GPL.
>
> We already do that.

Oops, missed that one, sorry ! I was looking at the interface
functions, but making the hooks themselves GPL-only is even
better.

> I don't envisage having an arbitrary set of hook points scattered
> throughout the kernel.

Let's hope you're right :-)

But wouldn't a small extension of kprobes get you pretty much
the same functionality/performance:

- at busy attachment points, add a "kprobe anchor", which
translates to five NOPs [1,2], preceded by a global symbol
- when setting a kprobe, check if the five bytes starting
at p->addr are NOPs [3]
- if yes, insert a call to kprobes_fastpath. if not, use
the current double breakpoint mechanism
- kprobes_fastpath can just return to the caller, no code
modification or single-stepping required

[1] Assuming i386.
[2] Or any sufficiently unlikely sequence of instructions that
executes faster than NOPs.
[3] Or some other pattern - but a quick look at the kernel binary
suggests that all strings of five or more NOPs are used for
    padding between functions, so it would be safe to assume that
any such sequence is a basic block.

The advantage over hooks would be that users of this mechanism
wouldn't have to choose between fast but intrusive (hooks) and
slow but flexible (probes).

Now, it's non-trivial to do a "return from caller" with
[kd]probes. I haven't looked at that part yet. Do you have the
infrastructure for this ?

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina [email protected] /
/_http://www.almesberger.net/____________________________________________/

2002-10-24 07:19:17

by Vamsi Krishna S .

Subject: Re: 2.4 Ready list - Kernel Hooks


On Wed, Oct 23, 2002 at 09:50:15PM +0000, Werner Almesberger wrote:
> Oops, missed that one, sorry ! I was looking at the interface
> functions, but making the hooks themselves GPL-only is even
> better.
>
Yes, I have done that in the latest patch.

> > I don't envisage having an arbitrary set of hook points scattered
> > throughout the kernel.
>
> Let's hope you're right :-)
>
kernelhooks is similar to notifier lists (include/linux/notifier.h),
only much faster when there are no users. This patch does not
add any hooks itself, I am sure placement of each hook will be
critically reviewed.

> But wouldn't a small extension of kprobes get you pretty much
> the same functionality/performance:
>
> <snip nice idea>
>
Yes, this is possible, but I think using hooks is much cleaner.

> The advantage over hooks would be that users of this mechanism
> wouldn't have to choose between fast but intrusive (hooks) and
> slow but flexible (probes).
>
As I see it, hooks should be used for allowing other kernel code
to tap into certain well defined paths in the kernel, say in
trap 3 or trap 1 handlers in the kernel to allow multiple
kernel-level breakpointing tools. Or, certain well defined paths
(potentially fast paths) for tracing purposes, where it is
necessary to ensure that most of the time there are no users
of these hooks and their placement alone should impose minimal
overhead.

So, hooks are designed, placed at well thought-out locations.
Probes OTOH are mostly ad-hoc. While debugging a problem, if
you find the need to probe a specific code location for more
info, put a probe there, on the fly, without going through
the recompile and reboot cycle.

> Now, it's non-trivial to do a "return from caller" with
> [kd]probes. I haven't looked at that part yet. Do you have the
> infrastructure for this ?
>
No, returning from caller will be much harder with [kd]probes.

Hope this clarifies the issue.

Thanks,
Vamsi.
--
Vamsi Krishna S.
Linux Technology Center,
IBM Software Lab, Bangalore.
Ph: +91 80 5044959
Internet: [email protected]

2002-10-24 16:41:17

by Richard J Moore

Subject: Re: 2.4 Ready list - Kernel Hooks


Greg KH wrote:
>Please read the LSM documentation for more information about this. It
>can be found in the kernel at:
> Documentation/DocBook/lsm.*
>and there are a number of USENIX and OLS papers about different aspects
>of the project at:


Thanks Greg. I'll check out the doc. I do remember posting to the LSM mailing
list about kernel hooks, but as I recall there was no response. I assumed
that the hooking mechanism was not the focus of attention - that was
18 months ago. A few weeks ago Suparna told me LSM had been enquiring about
kernel hooks - never heard the outcome though.

Richard



2002-10-24 16:57:51

by Greg KH

Subject: Re: 2.4 Ready list - Kernel Hooks

On Thu, Oct 24, 2002 at 05:38:12PM +0100, Richard J Moore wrote:
>
> A few weeks ago Suparna told me LSM had been enquiring about kernel
> hooks - never heard the outcome though.

Wrong type of "hooks". Ours would not work for what you are stating you
need to do, sorry.

thanks,

greg k-h

2002-10-24 17:16:08

by Werner Almesberger

Subject: Re: 2.4 Ready list - Kernel Hooks

Vamsi Krishna S . wrote:
> So, hooks are designed, placed at well thought-out locations.
> Probes OTOH are mostly ad-hoc.

Yes, my point was that the same (general) mechanism should be
suitable for both types of use. However, ...

>> [kd]probes. I haven't looked at that part yet. Do you have the
>> infrastructure for this ?
>>
> No, returning from caller will be much harder with [kd]probes.

... this seems to kill my grand unified hook/probe theory :-(

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina [email protected] /
/_http://www.almesberger.net/____________________________________________/

2002-10-25 10:11:46

by Suparna Bhattacharya

Subject: Re: 2.5 Ready list - Kernel Hooks

On Thu, Oct 24, 2002 at 05:06:44PM +0000, Greg KH wrote:
> On Thu, Oct 24, 2002 at 05:38:12PM +0100, Richard J Moore wrote:
> >
> > A few weeks ago Suparna told me LSM had been enquiring about kernel
> > hooks - never heard the outcome though.
>
> Wrong type of "hooks". Ours would not work for what you are stating you
> need to do, sorry.

Backing up a bit, to the more basic question raised briefly
at the kernel summit. Does it make sense to use a common underlying
hooking mechanism for the various kinds of subsystems
that use some form of hooks ? Or are the requirements radically
different enough that it is better for each to devise its own ?

The main advantage of a common mechanism (needn't necessarily be
kernel hooks, by the way) is scope for single source optimization
(including arch specific optimizations where feasible) and a
common scheme of management. These are reasons why LTT, for
example, might make use of kernel hooks.

The downside of course is that one solution may not suit all,
and in some cases (where the above aspects are not critical)
people might prefer as a matter of taste to have explicit subsystem
specific calls that clearly indicate the kind of component using the
hooks. (Am wondering if this is one of the reasons why LSM
would prefer not to link up with kernel hooks. Is that it ?)

A few clarifications with regard to Kernel hooks, Dprobes and LSM.

To reiterate Vamsi's point, Dprobes and Kernel Hooks are different.
Dprobes is meant for probing on demand and based on a breakpointing
mechanism, and yes, it isn't at all meant for the kinds of things
that LSM is doing.

Kernel hooks are really more like notifiers meant for fast or
frequently accessed paths, optimized for minimal overhead when
the hooks are not active. These hooks can be placed at any code
location and the hook operation can be passed local variables
directly as arguments (no need to build up a structure etc).

The component which registers operations could be part of the
kernel or in a GPL'ed kernel module. Just like notifiers, the call
to the hooks has to be there beforehand in the kernel at necessary
locations.

So this is really very similar in effect to invoking a function
pointer (from a vector of hook operations), with a desired set of
parameters. The main difference is in the mechanism underneath
in terms of how it performs when the operations are dummy (i.e.
not really active).

Given this, situations where one might investigate the kernel
hooks option:
1. Hooks/function pointers called in frequently hit paths
2. Hook operations that may largely be dummy/dormant in most
typical situations

BTW, if the function pointers are context based (i.e. object
specific where the object is a runtime parameter) then one
couldn't use kernel location hooks (LSM security operations
are in their own table though, not object specific like inode
operations, right ?)

At Ottawa, when LSM was being presented there was some mention
of optimized hooking. If I recall correctly the security_ops
vector had around 60+ operations/hooks and it did seem like
a given security module might not be using all the hooks
(capability.c probably uses only one-fifth of the hooks).

My perspective on LSM is limited, so could you tell me if this
is the common pattern on most linux installations i.e. a large
percentage of the hooks being inactive/dummy (condition 2
above) ? Or do you expect all the 68 hooks to be in-use in
general by some module or the other ?

Richard,
IIRC Chris Wright had (shortly after the kernel summit)
asked me for a reference to the kernel hooks site, but we
really didn't get to discuss anything subsequently.

Now, looking at the security hook operations themselves, the
degree of optimization that kernel hooks try to provide
appears to be unnecessary for all the security hooks
e.g. some of the operations being mediated are more
administrative in nature and do not happen frequently, and
it is likely that many of the other hooks may not be very
time critical (could someone confirm if this is correct ?).
So kernel hooks could potentially be used AFAICT (I could be
missing something, of course), at the same time, I'm not sure
if condition 2 above applies in this case.

One situation where I see an important difference is in the
stacking of security modules - kernel hooks do stacking only on
a per-hook basis, while for LSM this needs to be done on a
group of hooks owned by a module. Also the LSM design seems to
leave the implementation of stacking to the discretion of the
active security module.

Regards
Suparna

>
> thanks,
>
> greg k-h
>

--
Suparna Bhattacharya ([email protected])
Linux Technology Center
IBM Software Labs, India

2002-10-31 07:27:21

by Greg KH

Subject: Re: 2.5 Ready list - Kernel Hooks

On Fri, Oct 25, 2002 at 03:49:22PM +0530, Suparna Bhattacharya wrote:
>
> The downside of course is that one solution may not suit all,
> and in some cases (where the above aspects are not critical)
> people might prefer as a matter of taste to have explicit subsystem
> specific calls that clearly indicate the kind of component using the
> hooks. (Am wondering if this is one of the reasons why LSM
> would prefer not to link up with kernel hooks. Is that it ?)

Yes, that is one of the main reasons LSM doesn't want to use such a
mechanism. A simple, explicit, function call is fine for what we need
to do.

thanks,

greg k-h

2002-10-31 07:33:26

by Greg KH

Subject: Re: 2.5 Ready list - Kernel Hooks

Oops, forgot to answer your other questions...

On Fri, Oct 25, 2002 at 03:49:22PM +0530, Suparna Bhattacharya wrote:
>
> So this is really very similar in effect to invoking a function
> pointer (from a vector of hook operations), with a desired set of
> parameters. The main difference is in the mechanism underneath
> in terms of how it performs when the operations are dummy (i.e.
> not really active).
>
> Given this, situations where one might investigate the kernel
> hooks option:
> 1. Hooks/function pointers called in frequently hit paths
> 2. Hook operations that may largely be dummy/dormant in most
> typical situations
>
> BTW, if the function pointers are context based (i.e. object
> specific where the object is a runtime parameter) then one
> couldn't use kernel location hooks (LSM security operations
> are in their own table though, not object specific like inode
> operations, right ?)

Yes they are in their own table.

> At Ottawa, when LSM was being presented there was some mention
> of optimized hooking. If I recall correctly the security_ops
> vector had around 60+ operations/hooks and it did seem like
> a given security module might not be using all the hooks
> (capability.c probably uses only one-fifth of the hooks).
>
> My perspective on LSM is limited, so could you tell me if this
> is the common pattern on most linux installations i.e. a large
> percentage of the hooks being inactive/dummy (condition 2
> above) ? Or do you expect all the 68 hooks to be in-use in
> general by some module or the other ?

I've created simple, useful modules that only use one hook. But SELinux
uses almost all of them. So there is no simple answer for this
question, sorry.

> Now, looking at the security hook operations themselves, the
> degree of optimization that kernel hooks try to provide
> appears to be unnecessary for all the security hooks
> e.g. some of the operations being mediated are more
> administrative in nature and do not happen frequently, and
> it is likely that many of the other hooks may not be very
> time critical (could someone confirm if this is correct ?).

Many of the LSM hooks are _very_ time critical (every read() call for
example.)

> So kernel hooks could potentially be used AFAICT (I could be
> missing something, of course), at the same time, I'm not sure
> if condition 2 above applies in this case.
>
> One situation where I see an important difference is in the
> stacking of security modules - kernel hooks do stacking only on
> a per-hook basis, while for LSM this needs to be done on a
> group of hooks owned by a module. Also the LSM design seems to
> leave the implementation of stacking to the discretion of the
> active security module.

Yes, that is a place of active disagreement among the different LSM
developers. Some people like it, others don't. Right now the current
implementation allows you to stack LSM modules if you want to, but you
are not forced to if you don't.

Also, I really don't like the use of the term "hook" for describing the
LSM functions, as you can see how people get confused about this :)
Some other term, like "callback", might be better.

thanks,

greg k-h