2006-05-16 17:46:46

by Martin Peschke

Subject: [RFC] [Patch 0/8] statistics infrastructure

(This is a sequel. What happened in the last season:
http://marc.theaimsgroup.com/?l=linux-kernel&m=113458576022747&w=2)

My patch series is a proposal for a generic implementation of statistics.
Envisioned exploiters include device drivers and any other kernel component.
It provides both a unified programming interface for exploiters and
a unified user interface. It comes with a set of disciplines that
implement various ways of data processing, such as counters and histograms.

The recent rework addresses performance issues and memory footprint,
straightens out some concepts, streamlines the programming interface,
removes some weirdness from the user interface, reduces the amount of
code, and relocates the exploiting code according to last time's feedback.

A few more keywords for the reader's convenience:
based on per-cpu data; spinlock-free protection of data; observes
cpu-hot(un)plug for efficient memory use; tiny state machine for
switching-on, switching-off, releasing data etc.; configurable by users
at run-time; still sitting in debugfs; simple addition of other disciplines.

Good places to start reading code are:

statistic_create(), statistic_remove()
statistic_add(), statistic_inc()
struct statistic_interface, struct statistic
struct statistic_discipline, statistic_*_counter()
statistic_transition()

I'd suggest you skip anything that looks like string manipulation, and
have a look at my humble attempt at a user interface once you are
familiar with the base functions.

Looking forward to your comments.

Martin



2006-05-17 17:24:20

by Frank Ch. Eigler

Subject: Re: [RFC] [Patch 0/8] statistics infrastructure


Martin Peschke <[email protected]> writes:

> My patch series is a proposal for a generic implementation of statistics.
> Envisioned exploiters include device drivers, and any other component.
> [...]
> Good places to start reading code are:
> statistic_create(), statistic_remove()
> statistic_add(), statistic_inc()
> [...]

It is interesting how many solutions pop up for this sort of problem.
The many tracing tools/patches, systemtap, and now this, all share
some goals and should ideally share some of the technology.

In particular, one of the common points is the designation of points
where significant events take place, and the passing of their parameters.
In your case, these are the statistic_add/inc() calls. In LTT, these
are macros or inline functions expanding to tracing calls. In
systemtap, ignoring the slower dynamic kprobes, we now have prototype
support for "markers": generic statically placed hooks that may be
bound to arbitrary instrumentation code. (I will be talking more
about this at OLS.)
<http://sourceware.org/ml/systemtap/2006-q1/msg00901.html>

It would be nice if we found a way to agree on one single hooking
mechanism, one that could be accepted here upstream, and used by all
these various projects for their own tracing, probing, or
statistics-collecting backends.

- FChE

2006-05-17 18:06:06

by Andi Kleen

Subject: Re: [RFC] [Patch 0/8] statistics infrastructure

On Wednesday 17 May 2006 19:23, Frank Ch. Eigler wrote:
>
> Martin Peschke <[email protected]> writes:
>
> > My patch series is a proposal for a generic implementation of statistics.
> > Envisioned exploiters include device drivers, and any other component.
> > [...]
> > Good places to start reading code are:
> > statistic_create(), statistic_remove()
> > statistic_add(), statistic_inc()
> > [...]
>
> It is interesting how many solutions pop up for this sort of problem.
> The many tracing tools/patches, systemtap, and now this, all share
> some goals and should ideally share some of the technology.

I disagree. They often have very different requirements - and a one-size-fits-all
solution will likely be too heavyweight for most users.

The passing of data to user space can be unified, but we already have
solutions for that (seq_*, relayfs). The actual data gathering, however,
is better custom-tailored.

-Andi

2006-05-17 18:28:26

by Frank Ch. Eigler

Subject: Re: [RFC] [Patch 0/8] statistics infrastructure

Hi -

On Wed, May 17, 2006 at 08:05:43PM +0200, Andi Kleen wrote:
> [...]
> > It is interesting how many solutions pop up for this sort of problem.
> > The many tracing tools/patches, systemtap, and now this, all share
> > some goals and should ideally share some of the technology.
>
> I disagree. They often have very different requirements - and a
> one-size-fits-all solution will be likely too heavyweight for most
> users.

I am not suggesting a single solution for all needs. I wanted to
focus on only one aspect: marking, with hooks, those points in the
kernel where something probeworthy occurs. The different tools would
still gather and disseminate their data in their own favorite way. The
main difference from the status quo is agreeing on and reusing a
common pool of hooks.

- FChE

2006-05-17 18:45:37

by Valdis Klētnieks

Subject: Re: [RFC] [Patch 0/8] statistics infrastructure

On Wed, 17 May 2006 14:28:08 EDT, "Frank Ch. Eigler" said:
> I am not suggesting a single solution for all needs. I wanted to
> focus on only one aspect: marking, with hooks, those points in the
> kernel where something probeworthy occurs. The different tools would
> still gather and disseminate their data in their own favorite way. The
> main difference from the status quo is agreeing on and reusing a
> common pool of hooks.

The problem is that the "common pool" ends up being a very wide swamp
very fast. The last few times I've needed any instrumentation in the
kernel, I was chasing slab leaks, and didn't need precise timing or
latency measurements. On the other hand, the RT guys probably don't
care all that much about slab events, but need timing and latency.
Then there's other guys that don't care about slab, timing, or latency,
but do care about some other events.

So under your plan, all 3 groups now use a "common pool" that includes
slab, timing, latency, and other stuff - and nobody's using more than
1/3 of it, while paying the performance penalty for the 2/3 unused hooks....




2006-05-17 18:55:45

by Frank Ch. Eigler

Subject: Re: [RFC] [Patch 0/8] statistics infrastructure

Hi -

On Wed, May 17, 2006 at 02:44:24PM -0400, [email protected] wrote:
> On Wed, 17 May 2006 14:28:08 EDT, "Frank Ch. Eigler" said:
> > I am not suggesting a single solution for all needs. I wanted to
> > focus on only one aspect: marking, with hooks, those points in the
> > kernel where something probeworthy occurs. [...]
>
> The problem is that the "common pool" ends up being a very wide swamp
> very fast. [...]
> So under your plan, all 3 groups now use a "common pool" that includes
> slab, timing, latency, and other stuff - and nobody's using more than
> 1/3 of it, but paying the performance penalty for the 2/3 unused hooks....

It may not be clear, but by "pool" I mean some group of individually
activated hooks, each doing little but calling some instrumentation
routine with a few parameters. Special-interest data like timing or
latency would be computed in the instrumentation code, not necessarily
at the hook site, so that part need incur no cost for disinterested
users.

Not-activated (dormant) hooks would indeed cost a little. The
question is how much time/space cost is acceptable, in order to reap
the benefits of widely available probing.

- FChE