2009-03-02 07:30:08

by Tom Zanussi

Subject: Re: [PATCH 2/4] zedtrace generic kernel filtering


On Sat, 2009-02-28 at 10:26 +0100, Ingo Molnar wrote:
> * Tom Zanussi <[email protected]> wrote:
>
> > Add generic kernel filtering.
> >
> > Signed-off-by: Tom Zanussi <[email protected]>
> >
> > ---
> > kernel/trace/trace_binary/Makefile | 2 +-
> > kernel/trace/trace_binary/zed.c | 103 +++++++++--
> > kernel/trace/trace_binary/zed.h | 15 ++
> > kernel/trace/trace_binary/zed_filter.c | 301 ++++++++++++++++++++++++++++++++
> > kernel/trace/trace_binary/zed_filter.h | 45 +++++
> > kernel/trace/trace_binary/zed_sched.c | 4 +
> > 6 files changed, 451 insertions(+), 19 deletions(-)
> > create mode 100644 kernel/trace/trace_binary/zed_filter.c
> > create mode 100644 kernel/trace/trace_binary/zed_filter.h
>
> Nice!
>

Thanks!

> This fits really nicely into the ftrace principles and i'd love
> to see this feature merged into ftrace - would you be interested
> in working on that? If so then you can find the latest tracing

Sure, but I'm not very familiar with the ftrace code and won't have
big blocks of time to devote to it, so if it's something that's needed
"right away", I'll probably have to let someone else adapt the code.

> tree at:
>
> http://people.redhat.com/mingo/tip.git/README
>
> Note that Steve added explicit field enumeration and 'raw' C
> syntax tracepoints to the event tracer earlier today (partly
> based on your ideas here), so that would be a good basis to
> extend/enhance/fix, if you are interested.
>

Yeah, I took a quick look and saw some nice improvements. Anyway, the
filtering I did for this was basically a side-effect of the event
description stuff, which made the filtering relatively easy to do (and
the event description files give the user a way to list the available
fields). What I'm wondering is if you're interested in the filtering
part alone or in the event description part as well, which I hadn't
thought of as separable (I guess I need to look at the current ftrace
code to see what's already there).

Thanks,

Tom

> Thanks,
>
> Ingo


2009-03-02 10:59:18

by Ingo Molnar

Subject: Re: [PATCH 2/4] zedtrace generic kernel filtering


* Tom Zanussi <[email protected]> wrote:

> > Note that Steve added explicit field enumeration and 'raw' C
> > syntax tracepoints to the event tracer earlier today (partly
> > based on your ideas here), so that would be a good basis to
> > extend/enhance/fix, if you are interested.
>
> Yeah, I took a quick look and saw some nice improvements.

:)

> Anyway, the filtering I did for this was basically a
> side-effect of the event description stuff, which made the
> filtering relatively easy to do (and the event description
> files give the user a way to list the available fields). What
> I'm wondering is if you're interested in the filtering part
> alone or in the event description part as well, which I hadn't
> thought of as separable (I guess I need to look at the current
> ftrace code to see what's already there).

No, not filtering alone - the event description / field enumeration
part is mandatory for user-space to be able to define filters,
so yes, that bit is also needed and desired. Steve already added
those bits; we just don't yet have them exposed in
/debug/tracing/events, like your patch does. (I think it's next
on Steve's TODO list.)
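
To make the dependency concrete, here is a minimal user-space sketch of why a filter needs the field enumeration: a filter expression such as "next_pid == 42" can only be resolved if the field name maps to an offset and size in the event record. Everything in it (the sched_switch_event layout, field_desc, filter_pred, and the helper functions) is a hypothetical illustration, not the zedtrace or ftrace code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event record; not the layout used by zedtrace or ftrace. */
struct sched_switch_event {
    int32_t prev_pid;
    int32_t next_pid;
    int32_t next_prio;
};

/* One enumerated field: its name plus where it lives in the record. */
struct field_desc {
    const char *name;
    size_t offset;
    size_t size;
};

#define FIELD(type, member) \
    { #member, offsetof(type, member), sizeof(((type *)0)->member) }

/* The "event description": the list a user could read to see the fields. */
static const struct field_desc sched_switch_fields[] = {
    FIELD(struct sched_switch_event, prev_pid),
    FIELD(struct sched_switch_event, next_pid),
    FIELD(struct sched_switch_event, next_prio),
};

#define N_FIELDS (sizeof(sched_switch_fields) / sizeof(sched_switch_fields[0]))

/* Look a field up by name, as a filter parser would; NULL if unknown. */
static const struct field_desc *find_field(const char *name)
{
    for (size_t i = 0; i < N_FIELDS; i++)
        if (strcmp(sched_switch_fields[i].name, name) == 0)
            return &sched_switch_fields[i];
    return NULL;
}

/* A single "field == value" predicate resolved against the description. */
struct filter_pred {
    const struct field_desc *field;
    int32_t value;
};

/* Returns 0 on success, -1 if the field name is not enumerated. */
static int filter_init(struct filter_pred *pred, const char *name, int32_t value)
{
    pred->field = find_field(name);
    pred->value = value;
    return pred->field ? 0 : -1;
}

/* Returns 1 if the record matches the predicate, 0 otherwise. */
static int filter_match(const struct filter_pred *pred, const void *record)
{
    int32_t v;

    /* This sketch only handles 32-bit integer fields. */
    assert(pred->field && pred->field->size == sizeof(v));
    memcpy(&v, (const char *)record + pred->field->offset, sizeof(v));
    return v == pred->value;
}
```

A caller would run filter_init() once when the filter is set and filter_match() per event; an unknown field name is rejected at setup time, which is only possible because the fields are enumerated.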

Basically, i think the big picture is the following. The best
model for tracepoints is for them to have the following life
cycle:

 - trace_printk() ad-hoc additions. Not stable, not hookable and
   not enumerated - but highly convenient.

 - if a trace_printk() turns out to be useful it might become a
   bit more active and turn into a regular tracepoint. This
   makes it hookable by ftrace plugins and makes it faster - but
   it's not generally enumerated yet.

 - the final stage for a tracepoint is for it to become a
   "C-style" tracepoint. That makes it generally available to
   all ftrace plugins, makes it available to opaque user-space
   consumption as well and all fields are enumerated. The
   in-kernel value filtering machinery you added can make use of
   them as well.

   ( The downside is (and there are always downsides ;-) that
     such tracepoints are the hardest to add and have the
     highest ongoing maintenance overhead - but that aspect is
     easily visible and will be a well understood property of
     them. )
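
As a rough illustration of the difference between the first two stages, the user-space sketch below contrasts an ad-hoc formatted message with a hookable tracepoint that hands typed arguments to registered probes. All names here (adhoc_trace, trace_irq_entry, register_irq_entry_probe, irq_counter) are hypothetical and are not the kernel's tracepoint API; the sketch only shows the shape of the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stage 1: ad-hoc, printk-style. The only output is free-form text. */
static int adhoc_trace(char *buf, size_t len, int irq)
{
    return snprintf(buf, len, "irq entry: %d", irq);
}

/* Stage 2: a hookable tracepoint. Probes receive the typed argument. */
typedef void (*irq_entry_probe)(void *data, int irq);

#define MAX_PROBES 4

static struct {
    irq_entry_probe fn;
    void *data;
} irq_entry_probes[MAX_PROBES];
static size_t n_irq_entry_probes;

/* Returns 0 on success, -1 if the probe table is full. */
static int register_irq_entry_probe(irq_entry_probe fn, void *data)
{
    if (n_irq_entry_probes >= MAX_PROBES)
        return -1;
    irq_entry_probes[n_irq_entry_probes].fn = fn;
    irq_entry_probes[n_irq_entry_probes].data = data;
    n_irq_entry_probes++;
    return 0;
}

/* The tracepoint call site: no formatting, just a loop over the probes. */
static void trace_irq_entry(int irq)
{
    for (size_t i = 0; i < n_irq_entry_probes; i++)
        irq_entry_probes[i].fn(irq_entry_probes[i].data, irq);
}

/* Example probe: counts how often one particular IRQ fires. */
struct irq_counter {
    int irq;
    unsigned long hits;
};

static void count_irq(void *data, int irq)
{
    struct irq_counter *c = data;

    if (c->irq == irq)
        c->hits++;
}
```

The stage-2 call site does no string formatting and lets a plugin attach its own logic, but nothing yet describes the arguments to an outside consumer; that description is what the final stage adds.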

Most tracepoints would stay on the first two, most
convenient-to-add levels - but eventually some would percolate
up to the last stage as well.

I think the ones you've identified in your patchset are good
candidates for that final stage already - and we've added a few
more too, such as the IRQ entry/exit tracepoints.

Ingo