2007-10-29 21:51:50

by Mathieu Desnoyers

Subject: [RFC] Create instrumentation directory (git repository)

Hi,

Since we already have the Instrumentation menu in
kernel/Kconfig.instrumentation and instrumentation code all over the
kernel tree:

arch/*/oprofile/*.c
kernel/kprobes.c
arch/*/kernel/kprobes.c
kernel/marker.c
kernel/profile.c
kernel/lockdep.c
vm/vmstat.c
block/blktrace.c
drivers/base/power/trace.c

We could move them to

instrumentation/
arch/*/instrumentation/

Therefore, we could also move the kprobes and marker samples under

instrumentation/samples/

Here is a link to a git repository containing the changes, based on
2.6.24-rc1:

git://ltt.polymtl.ca/linux-2.6-instrumentation.git instrumentation-for-linus
(the interesting range is : v2.6.24-rc1..instrumentation-for-linus)

Through the gitweb interface:
http://ltt.polymtl.ca/cgi-bin/gitweb.cgi?p=linux-2.6-instrumentation.git

Feedback is appreciated. Sorry for the huge CC list, but the change
involves many maintainers.

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68


2007-10-29 22:47:54

by Randy Dunlap

Subject: Re: [RFC] Create instrumentation directory (git repository)

On Mon, 29 Oct 2007 17:51:38 -0400 Mathieu Desnoyers wrote:

> Hi,
>
> Since we already have the Instrumentation menu in
> kernel/Kconfig.instrumentation and instrumentation code all over the
> kernel tree:
>
> arch/*/oprofile/*.c
> kernel/kprobes.c
> arch/*/kernel/kprobes.c
> kernel/marker.c
> kernel/profile.c
> kernel/lockdep.c
> vm/vmstat.c
> block/blktrace.c
> drivers/base/power/trace.c
>
> We could move them to
>
> instrumentation/
> arch/*/instrumentation/
>
> Therefore, we could also move the kprobes and marker samples under
>
> instrumentation/samples/
>
> Here is a link to a git repository containing the changes, based on
> 2.6.24-rc1:
>
> git://ltt.polymtl.ca/linux-2.6-instrumentation.git instrumentation-for-linus
> (the interesting range is : v2.6.24-rc1..instrumentation-for-linus)
>
> Through the gitweb interface:
> http://ltt.polymtl.ca/cgi-bin/gitweb.cgi?p=linux-2.6-instrumentation.git
>
> Feedback is appreciated. Sorry for the huge CC list, but the change
> involves many maintainers.

Two more added. Jeff Garzik and Christoph H. sometimes have some comments
about this.

It would be helpful if we could get comments on this in the next day
or two [instead of in 1-2 weeks].

Thanks,
---
~Randy

2007-10-29 22:54:46

by Jeff Garzik

Subject: Re: [RFC] Create instrumentation directory (git repository)

Randy Dunlap wrote:
> On Mon, 29 Oct 2007 17:51:38 -0400 Mathieu Desnoyers wrote:
>
>> Hi,
>>
>> Since we already have the Instrumentation menu in
>> kernel/Kconfig.instrumentation and instrumentation code all over the
>> kernel tree:
>>
>> arch/*/oprofile/*.c
>> kernel/kprobes.c
>> arch/*/kernel/kprobes.c
>> kernel/marker.c
>> kernel/profile.c
>> kernel/lockdep.c
>> vm/vmstat.c
>> block/blktrace.c
>> drivers/base/power/trace.c
>>
>> We could move them to
>>
>> instrumentation/
>> arch/*/instrumentation/
>>
>> Therefore, we could also move the kprobes and marker samples under
>>
>> instrumentation/samples/
>>
>> Here is a link to a git repository containing the changes, based on
>> 2.6.24-rc1:
>>
>> git://ltt.polymtl.ca/linux-2.6-instrumentation.git instrumentation-for-linus
>> (the interesting range is : v2.6.24-rc1..instrumentation-for-linus)
>>
>> Through the gitweb interface:
>> http://ltt.polymtl.ca/cgi-bin/gitweb.cgi?p=linux-2.6-instrumentation.git
>>
>> Feedback is appreciated. Sorry for the huge CC list, but the change
>> involves many maintainers.
>
> Two more added. Jeff Garzik and Christoph H. sometimes have some comments
> about this.
>
> It would be helpful if we could get comments on this in the next day
> or two [instead of in 1-2 weeks].

"instrumentation" is long, and painful to the fingers :)

Jeff


2007-10-29 23:04:22

by Mathieu Desnoyers

Subject: Re: [RFC] Create instrumentation directory (git repository)

* Jeff Garzik ([email protected]) wrote:
> Randy Dunlap wrote:
> >On Mon, 29 Oct 2007 17:51:38 -0400 Mathieu Desnoyers wrote:
> >
> >>Hi,
> >>
> >>Since we already have the Instrumentation menu in
> >>kernel/Kconfig.instrumentation and instrumentation code all over the
> >>kernel tree:
> >>
> >>arch/*/oprofile/*.c
> >>kernel/kprobes.c
> >>arch/*/kernel/kprobes.c
> >>kernel/marker.c
> >>kernel/profile.c
> >>kernel/lockdep.c
> >>vm/vmstat.c
> >>block/blktrace.c
> >>drivers/base/power/trace.c
> >>
> >>We could move them to
> >>
> >>instrumentation/
> >>arch/*/instrumentation/
> >>
> >>Therefore, we could also move the kprobes and marker samples under
> >>
> >>instrumentation/samples/
> >>
> >>Here is a link to a git repository containing the changes, based on
> >>2.6.24-rc1:
> >>
> >>git://ltt.polymtl.ca/linux-2.6-instrumentation.git
> >>instrumentation-for-linus
> >>(the interesting range is : v2.6.24-rc1..instrumentation-for-linus)
> >>
> >>Through the gitweb interface:
> >>http://ltt.polymtl.ca/cgi-bin/gitweb.cgi?p=linux-2.6-instrumentation.git
> >>
> >>Feedback is appreciated. Sorry for the huge CC list, but the change
> >>involves many maintainers.
> >
> >Two more added. Jeff Garzik and Christoph H. sometimes have some comments
> >about this.
> >
> >It would be helpful if we could get comments on this in the next day
> >or two [instead of in 1-2 weeks].
>
> "instrumentation" is long, and painful to the fingers :)
>

Quoting my post from last week:

> My main concern is that 15 characters long directory name might be
> inelegant (however, it only beats Documentation by 2).

And quoting the answer from [email protected] :
How so? i n s esc. 4 keystrokes (and still 2 more than D<ESC> ;)



Better suggestions are very welcome. However, in modern shells,
auto-completion is cheap nowadays.

Mathieu


--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2007-10-29 23:08:54

by Jeff Garzik

Subject: Re: [RFC] Create instrumentation directory (git repository)

Mathieu Desnoyers wrote:
> * Jeff Garzik ([email protected]) wrote:
>> Randy Dunlap wrote:
>>> On Mon, 29 Oct 2007 17:51:38 -0400 Mathieu Desnoyers wrote:
>>>
>>>> Hi,
>>>>
>>>> Since we already have the Instrumentation menu in
>>>> kernel/Kconfig.instrumentation and instrumentation code all over the
>>>> kernel tree:
>>>>
>>>> arch/*/oprofile/*.c
>>>> kernel/kprobes.c
>>>> arch/*/kernel/kprobes.c
>>>> kernel/marker.c
>>>> kernel/profile.c
>>>> kernel/lockdep.c
>>>> vm/vmstat.c
>>>> block/blktrace.c
>>>> drivers/base/power/trace.c
>>>>
>>>> We could move them to
>>>>
>>>> instrumentation/
>>>> arch/*/instrumentation/
>>>>
>>>> Therefore, we could also move the kprobes and marker samples under
>>>>
>>>> instrumentation/samples/
>>>>
>>>> Here is a link to a git repository containing the changes, based on
>>>> 2.6.24-rc1:
>>>>
>>>> git://ltt.polymtl.ca/linux-2.6-instrumentation.git
>>>> instrumentation-for-linus
>>>> (the interesting range is : v2.6.24-rc1..instrumentation-for-linus)
>>>>
>>>> Through the gitweb interface:
>>>> http://ltt.polymtl.ca/cgi-bin/gitweb.cgi?p=linux-2.6-instrumentation.git
>>>>
>>>> Feedback is appreciated. Sorry for the huge CC list, but the change
>>>> involves many maintainers.
>>> Two more added. Jeff Garzik and Christoph H. sometimes have some comments
>>> about this.
>>>
>>> It would be helpful if we could get comments on this in the next day
>>> or two [instead of in 1-2 weeks].
>> "instrumentation" is long, and painful to the fingers :)
>>
>
> Quoting my post from last week:
>
>> My main concern is that 15 characters long directory name might be
>> inelegant (however, it only beats Documentation by 2).
>
> And quoting the answer from [email protected] :
> How so? i n s esc. 4 keystrokes (and still 2 more than D<ESC> ;)
>
>
>
> Better suggestions are very welcome. However, in modern shells,
> auto-completion is cheap nowadays.

That is no excuse for extreme verbosity. It makes ls(1) displays ugly,
it makes diffstat ugly, it causes long pathnames to be truncated in
various display-oriented programs.

Pick a shorter word like probes or profile or what... or better yet...
just leave most things in their current directories.

Shuffling files around just to put them into directories with extra-long
names is highly undesirable.

Jeff


2007-10-29 23:20:25

by Christoph Lameter

Subject: Re: [RFC] Create instrumentation directory (git repository)

On Mon, 29 Oct 2007, Mathieu Desnoyers wrote:

> vm/vmstat.c

The vm statistics are important for the operation of the VM. They are not
optional. So I do not think that they fall under the category of
instrumentation.

2007-10-29 23:45:28

by Mathieu Desnoyers

Subject: Re: [RFC] Create instrumentation directory (git repository)

* Christoph Lameter ([email protected]) wrote:
> On Mon, 29 Oct 2007, Mathieu Desnoyers wrote:
>
> > vm/vmstat.c
>
> The vm statistics are important for the operation of the VM. They are not
> optional. So I do not think that they fall under the category of
> instrumentation.

But I guess vm stats can be useful to others; a kernel tracer for
instance ?

Putting stuff in instrumentation/ in no way means that it becomes
optional for a subsystem, but merely that it could either export
information useful for kernel instrumentation or have some
infrastructure parts merged with others.

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2007-10-29 23:45:42

by Mathieu Desnoyers

Subject: Re: [RFC] Create instrumentation directory (git repository)

* Jeff Garzik ([email protected]) wrote:
> Mathieu Desnoyers wrote:
> >* Jeff Garzik ([email protected]) wrote:
> >>Randy Dunlap wrote:
> >>>On Mon, 29 Oct 2007 17:51:38 -0400 Mathieu Desnoyers wrote:
> >>>
> >>>>Hi,
> >>>>
> >>>>Since we already have the Instrumentation menu in
> >>>>kernel/Kconfig.instrumentation and instrumentation code all over the
> >>>>kernel tree:
> >>>>
> >>>>arch/*/oprofile/*.c
> >>>>kernel/kprobes.c
> >>>>arch/*/kernel/kprobes.c
> >>>>kernel/marker.c
> >>>>kernel/profile.c
> >>>>kernel/lockdep.c
> >>>>vm/vmstat.c
> >>>>block/blktrace.c
> >>>>drivers/base/power/trace.c
> >>>>
> >>>>We could move them to
> >>>>
> >>>>instrumentation/
> >>>>arch/*/instrumentation/
> >>>>
> >>>>Therefore, we could also move the kprobes and marker samples under
> >>>>
> >>>>instrumentation/samples/
> >>>>
> >>>>Here is a link to a git repository containing the changes, based on
> >>>>2.6.24-rc1:
> >>>>
> >>>>git://ltt.polymtl.ca/linux-2.6-instrumentation.git
> >>>>instrumentation-for-linus
> >>>>(the interesting range is : v2.6.24-rc1..instrumentation-for-linus)
> >>>>
> >>>>Through the gitweb interface:
> >>>>http://ltt.polymtl.ca/cgi-bin/gitweb.cgi?p=linux-2.6-instrumentation.git
> >>>>
> >>>>Feedback is appreciated. Sorry for the huge CC list, but the change
> >>>>involves many maintainers.
> >>>Two more added. Jeff Garzik and Christoph H. sometimes have some
> >>>comments
> >>>about this.
> >>>
> >>>It would be helpful if we could get comments on this in the next day
> >>>or two [instead of in 1-2 weeks].
> >>"instrumentation" is long, and painful to the fingers :)
> >>
> >
> >Quoting my post from last week:
> >
> >>My main concern is that 15 characters long directory name might be
> >>inelegant (however, it only beats Documentation by 2).
> >
> >And quoting the answer from [email protected] :
> >How so? i n s esc. 4 keystrokes (and still 2 more than D<ESC> ;)
> >
> >
> >
> >Better suggestions are very welcome. However, in modern shells,
> >auto-completion is cheap nowadays.
>
> That is no excuse for extreme verbosity. It makes ls(1) displays ugly,
> it makes diffstat ugly, it causes long pathnames to be truncated in
> various display-oriented programs.
>
> Pick a shorter word like probes or profile or what... or better yet...
> just leave most things in their current directories.
>
> Shuffling files around just to put them into directories with extra-long
> names is highly undesirable.
>
>

I'll keep the probes and profile directory name ideas in mind, thanks.

This patchset does more than move things around: its purpose is to
gather various kernel files that have a similar purpose (instrumentation)
into a single directory so that it becomes easier to work on them
without duplicating effort.

I see no good reason to have so many different ad hoc instrumentation
mechanisms for profiling (sched, vm, oprofile) and tracing (blktrace,
suspend/resume tracing) all over the place. Merging them in a single
directory seems like a good step towards a more generic
instrumentation/profiling/tracing infrastructure.

Back to "profile" and "probes" directory names, they might be short, but
they do not represent the whole markup-profiling-tracing trio,
"profile" lacks the tracing part and "probe" lacks the markup part.

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2007-10-29 23:45:58

by Christoph Lameter

Subject: Re: [RFC] Create instrumentation directory (git repository)

On Mon, 29 Oct 2007, Mathieu Desnoyers wrote:

> * Christoph Lameter ([email protected]) wrote:
> > On Mon, 29 Oct 2007, Mathieu Desnoyers wrote:
> >
> > > vm/vmstat.c
> >
> > The vm statistics are important for the operation of the VM. They are not
> > optional. So I do not think that they fall under the category of
> > instrumentation.
>
> But I guess vm stats can be useful to others; a kernel tracer for
> instance ?

Yes.

> Putting stuff in instrumentation/ in no way means that it becomes
> optional for a subsystem, but merely that it could either export
> information useful for kernel instrumentation or have some
> infrastructure parts merged with others.

The vm statistics are intricately connected with other mm code. Best leave
it where it is. The other instrumentation is something that is put in
particularly for gaining statistics.



2007-10-30 00:47:44

by Jeff Garzik

Subject: Re: [RFC] Create instrumentation directory (git repository)

Mathieu Desnoyers wrote:
> I see no good reason to have so many different ad hoc instrumentation
> mechanisms for profiling (sched, vm, oprofile) and tracing (blktrace,
> suspend/resume tracing) all over the place. Merging them in a single
> directory seems like a good step towards a more generic
> instrumentation/profiling/tracing infrastructure.

Moving files about in directories should be at the /lowest/ end of the
priority scale. It makes diffs unreadable, file histories and diffing
difficult, and a host of other problems.

Please solve the /real/ problems, and then come back and clean up the
file structure after that is done. Massive file renaming to satisfy
some imagined future everything-is-golden scheme is the /last/ step. It
is the last step taken because the previous steps inevitably give you
guidance that you otherwise would not have had at the start of the task.

When I try to diff between old and new alpha oprofile code, I really
want to know that the reason why diffing is a pain in the ass is more
than "it seemed like a good first step."


> Back to "profile" and "probes" directory names, they might be short, but
> they do not represent the whole markup-profiling-tracing trio,
> "profile" lacks the tracing part and "probe" lacks the markup part.

You can always add more letters (and words) to even reach the desired
level of specificity. That does nothing to help readability though.

Anyway, it should be clear from existing precedent -- existing pathnames
-- that "instrumentation" is too long, and really IMO too vague anyway.

Jeff


2007-10-30 00:51:28

by Jeff Garzik

Subject: Re: [RFC] Create instrumentation directory (git repository)

Mathieu Desnoyers wrote:
> Putting stuff in instrumentation/ in no way means that it becomes
> optional for a subsystem, but merely that it could either export
> information useful for kernel instrumentation or have some
> infrastructure parts merged with others.

More reason why you should not be moving stuff all around the tree...

Really, file structure is one of the LEAST important issues around --
while moving files around introduces a non-zero amount of pain.

New files -- like that godawful and nearly empty samples/ directory --
sure, fix that up before release. But let's not break diffs of existing
architectures without good reason.

Jeff


2007-10-30 01:43:19

by Mathieu Desnoyers

Subject: Re: [RFC] Create instrumentation directory (git repository)

* Jeff Garzik ([email protected]) wrote:
> Mathieu Desnoyers wrote:
> >I see no good reason to have so many different ad hoc instrumentation
> >mechanisms for profiling (sched, vm, oprofile) and tracing (blktrace,
> >suspend/resume tracing) all over the place. Merging them in a single
> >directory seems like a good step towards a more generic
> >instrumentation/profiling/tracing infrastructure.
>
> Moving files about in directories should be at the /lowest/ end of the
> priority scale. It makes diffs unreadable, file histories and diffing
> difficult, and a host of other problems.
>
> Please solve the /real/ problems, and then come back and clean up the
> file structure after that is done. Massive file renaming to satisfy
> some imagined future everything-is-golden scheme is the /last/ step. It
> is the last step taken because the previous steps inevitably give you
> guidance that you otherwise would not have had at the start of the task.
>
> When I try to diff between old and new alpha oprofile code, I really
> want to know that the reason why diffing is a pain in the ass is more
> than "it seemed like a good first step."
>

And how does this square with the way the i386/x86_64 -> x86 merge is
being done? It seems like a good current counter-example to what you
just asserted.

First organizing the functionally similar existing code into a single
placeholder will help find code duplication, just as it does for two
very similar architectures such as i386 and x86_64.

Talking about solving "real" problems, this is what I have been working
on for about 3 years in the kernel tracing area, writing the LTTng
tracer. What I see at this point is that there is a strong interest in
collaboration between the instrumentation projects (LTTng, SystemTap,
DTI), but since the code ends up being sprinkled all across the kernel,
it's rather hard to spot duplicates. Actually, I only ran into Linus's
suspend/resume tracer _today_.

Talking about solving real problems, this is also what I did with the
Linux Kernel Markers patch, which can now be used to instrument the
kernel code. But it only deals with one aspect of instrumentation: the
markup itself.

I would sort what we need for instrumentation into the following
categories:

- Data identification
  * static markup, enabled dynamically, very low impact
  * dynamic markup
  * oprofile (especially for the performance counters)
  * stack traces
- Control
  * Tracing management
  * Profiling management
  * PMC management
- Data extraction
  * relay
  * debugfs
  * serial port output
  * LKCD

What I consider to fit into the instrumentation directory is the data
identification and the control mechanisms. The data extraction should be
done by generic pieces of infrastructure already present in the kernel.

Your suggestion of "first fixing the real problems" (by this, do you
mean adding new code?) and only later bothering about the file structure
seems to go against most suggestions I have received from kernel
developers in the past years. Getting something new in the kernel is
much more straightforward if someone is willing to first clean up the
mess (I am quoting Thomas Gleixner here). So, in this particular case,
addressing the real problem (people out there want a tracer in the
Linux kernel) will first require cleaning up the currently existing
mess: overlapping instrumentation code sprinkled all over the place.


>
> >Back to "profile" and "probes" directory names, they might be short, but
> >they do not represent the whole markup-profiling-tracing trio,
> >"profile" lacks the tracing part and "probe" lacks the markup part.
>
> You can always add more letters (and words) to even reach the desired
> level of specificity. That does nothing to help readability though.
>
> Anyway, it should be clear from existing precedent -- existing pathnames
> -- that "instrumentation" is too long, and really IMO too vague anyway.
>

I guess you don't use the Documentation/ directory often then. ;)

How about i13n? :) Jokes aside, I could live with "probe", although it
doesn't fit the purpose exactly. To me, getting the perfect name comes
second to the need to group those files.

Mathieu


--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2007-10-30 09:15:05

by Arnaldo Carvalho de Melo

Subject: Re: [RFC] Create instrumentation directory (git repository)

On Mon, Oct 29, 2007 at 07:35:15PM -0400, Mathieu Desnoyers wrote:
> I see no good reason to have so many different ad hoc instrumentation
> mechanisms for profiling (sched, vm, oprofile) and tracing (blktrace,
> suspend/resume tracing) all over the place. Merging them in a single
> directory seems like a good step towards a more generic
> instrumentation/profiling/tracing infrastructure.
>
> Back to "profile" and "probes" directory names, they might be short, but
> they do not represent the whole markup-profiling-tracing trio,
> "profile" lacks the tracing part and "probe" lacks the markup part.

i14m
hooks

8)

- Arnaldo

2007-10-30 17:52:38

by Linus Torvalds

Subject: Re: [RFC] Create kinst/ or ki/ directory ?



On Tue, 30 Oct 2007, Mathieu Desnoyers wrote:
>
> * Jeff Garzik ([email protected]) wrote:
> ...
> > Pick a shorter word like probes or profile or what... or better yet...
> > just leave most things in their current directories.
> ...
>
>
> How about something along the
>
> kinst or ki
>
> lines ?
>
> (for "kernel instrumentation")

No, that's horrible.

Also, in general, why do people want to have an "instrumentation" thing?
Yes, you can put random things into the same box, but that doesn't make
them be the same thing. Personally, I don't think "instrumentation" is
very useful at all. I consider "profiling" and "markers" to be two
fundamentally different things, and putting them both in the same box does
not make them any more similar.

Yes, technically they are both "instrumentation", but hey, technically the
VM and the VFS layer are both "infrastructure", but we don't put *those*
in a "infrastructure" subdirectory.

In other words, the fact that two different things share some attribute
does not mean that they should be collapsed together by that attribute,
does it?

I think "instrumentation" was/is a particularly bad thing to group things
by. It doesn't actually tell you anything about the thing, and it's not
even true that some people are interested in "instrumentation" and others
aren't.

For example: I think profiling support is something REALLY FUNDAMENTAL.
It's something each and every developer should generally care about, and
OProfile should be considered an indispensable tool for any developer, on
par with something like gdb.

In contrast, we should *not* expect most people to do any kernel markers
etc. That's a very esoteric thing.

So I actually think that the current Kconfig.instrumentation should be
*removed*. Rather than adding more groupings based on that fundamentally
flawed premise of false commonality.

Linus

2007-10-30 17:55:53

by Peter Zijlstra

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

On Tue, 2007-10-30 at 13:24 -0400, Mathieu Desnoyers wrote:
> * Jeff Garzik ([email protected]) wrote:
> ...
> > Pick a shorter word like probes or profile or what... or better yet...
> > just leave most things in their current directories.
> ...
>
>
> How about something along the
>
> kinst or ki
>
> lines ?
>
> (for "kernel instrumentation")

I think I'm with jgarzik on this: let's not do this until it's clear where
the generalized instrumentation is going.

That is, i386/x86_64 -> x86 was part of a full integration plan, one
that was immediately followed up by a series of integration patches.

With this, I see no such plan. Please draft this generic instrumentation
you talk about. If, after that, we all like it, we can go moving files
together with the immediate purpose of integrating them.



2007-10-30 17:59:59

by Christoph Hellwig

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

> I think "instrumentation" was/is a particularly bad thing to group things
> by. It doesn't actually tell you anything about the thing, and it's not
> even true that some people are interested in "instrumentation" and others
> aren't.

I completely agree. This is not the kind of thing we need to categorize.

2007-10-30 18:04:54

by Mathieu Desnoyers

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

* Jeff Garzik ([email protected]) wrote:
...
> Pick a shorter word like probes or profile or what... or better yet...
> just leave most things in their current directories.
...


How about something along the

kinst or ki

lines ?

(for "kernel instrumentation")

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2007-10-30 18:56:51

by Mathieu Desnoyers

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

* Linus Torvalds ([email protected]) wrote:
>
>
> On Tue, 30 Oct 2007, Mathieu Desnoyers wrote:
> >
> > * Jeff Garzik ([email protected]) wrote:
> > ...
> > > Pick a shorter word like probes or profile or what... or better yet...
> > > just leave most things in their current directories.
> > ...
> >
> >
> > How about something along the
> >
> > kinst or ki
> >
> > lines ?
> >
> > (for "kernel instrumentation")
>
> No, that's horrible.
>
> Also, in general, why do people want to have an "instrumentation" thing?
> Yes, you can put random things into the same box, but that doesn't make
> them be the same thing. Personally, I don't think "instrumentation" is
> very useful at all. I consider "profiling" and "markers" to be two
> fundamentally different things, and putting them both in the same box does
> not make them any more similar.
>
> Yes, technically they are both "instrumentation", but hey, technically the
> VM and the VFS layer are both "infrastructure", but we don't put *those*
> in a "infrastructure" subdirectory.
>

The key idea for collapsing profiling, markup and tracing was that
marking up the code is required for both profiling and tracing. It's
only the code that is called from that markup site that differs.

> In other words, the fact that two different things share some attribute
> does not mean that they should be collapsed together by that attribute,
> does it?

It becomes interesting when they can share code and/or a common control
architecture. The fact that markup could be shared between profiling and
tracing could be a good incentive to do so.

>
> I think "instrumentation" was/is a particularly bad thing to group things
> by. It doesn't actually tell you anything about the thing, and it's not
> even true that some people are interested in "instrumentation" and others
> aren't.
>

Ok, so maybe we should keep "markup", "tracing" and "profiling"
separately and see how things evolve.


> For example: I think profiling support is something REALLY FUNDAMENTAL.
> It's something each and every developer should generally care about, and
> OProfile should be considered an indispensable tool for any developer, on
> par with something like gdb.
>
> In contrast, we should *not* expect most people to do any kernel markers
> etc. That's a very esoteric thing.
>

With SMP systems becoming cheap commodity hardware, each and every
developer increasingly faces thorny race problems, both in user-space
apps and in the kernel, which may involve hypervisor-kernel-userspace
interaction. Sadly, the blame is often put on kernel developers because
tools like gdb, oprofile and strace are practically useless for solving
such problems and people lack the right tool for the job.

Therefore, marking up the code to perform tracing should not be
considered esoteric: it's a very useful tool when one needs to
understand what is happening in a large-scale system. Userspace
doesn't always have the ability to isolate problems and, worse, some
problems are just unreproducible when one tries to isolate them. I think
it is sensible to give people a tool that helps them understand what is
going on.


> So I actually think that the current Kconfig.instrumentation should be
> *removed*. Rather than adding more groupings based on that fundamentally
> flawed premise of false commonality.
>

Should it come with a re-duplication of its content into each
architecture, which was the case previously? The oprofile and kprobes menu
entries were literally cut and pasted from one architecture to another.
Should we put its content in init/Kconfig then?

Regards,

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2007-10-30 19:33:07

by Linus Torvalds

Subject: Re: [RFC] Create kinst/ or ki/ directory ?



On Tue, 30 Oct 2007, Mathieu Desnoyers wrote:
>
> The key idea for collapsing profiling, markup and tracing was that
> marking up the code is required for both profiling and tracing. It's
> only the code that is called from that markup site that differs.

What code is actually shared?

Regardless, an internal implementation issue is *not* a good basis for a
user-visible interface.

> Ok, so maybe we should keep "markup", "tracing" and "profiling"
> separately and see how things evolve.

I think so. At least conceptually - ie it might be fine to share a Kconfig
file, but there probably shouldn't be some forced shared choice about it.

> With SMP systems becoming cheap commodity hardware, each and every
> developer increasingly faces thorny race problems, both in user-space
> apps and in the kernel, which may involve hypervisor-kernel-userspace
> interaction.

Well, the thing is, most of the time, those app developers will not be
doing kernel-level markers. But they may well be doing profiling.

Speaking as an application developer myself (git), I care deeply about
good profiling info, and I love Oprofile. But even though I'm a kernel
person too, I'd not want to do kprobes. It's just not relevant to me as a
user-land developer.

(I might want to extend on strace, but if so, I'd do it generically, not
as a "probe". For example, I'd love to see the page faults, but I think
they really *are* "system calls", so I think it would make more sense to
extend on the ptrace interface than to have any kprobes thing)

> > So I actually think that the current Kconfig.instrumentation should be
> > *removed*. Rather than adding more groupings based on that
> > fundamentally flawed premise of false commonality.
>
> Should it come with a re-duplication of its content into each
> architecture, which was the case previously ? The oprofile and kprobes
> menu entries were literally cut and pasted from one architecture to
> another. Should we put its content in init/Kconfig then ?

I don't think it's a good idea to go back to making it per-architecture,
although that extensive "depends on <list-of-architectures-here>" might
indicate that there certainly is room for cleanup there.

And I don't think it's wrong keeping it in kernel/Kconfig.xyz per se, I
just think it's wrong to (a) lump the code together when it really doesn't
necessarily need to and (b) show it to users as some kind of choice that
is tied together (whether it then has common code or not).

On the per-architecture side, I do think it would be better to *not* have
internal architecture knowledge in a generic file, and as such a line like

	depends on X86_32 || IA64 || PPC || S390 || SPARC64 || X86_64 || AVR32

really shouldn't exist in a file like kernel/Kconfig.instrumentation.

It would be much better to do

	depends on ARCH_SUPPORTS_KPROBES

in that generic file, and then architectures that do support it would just
have a

	bool ARCH_SUPPORTS_KPROBES
		default y

in *their* architecture files. That would seem to be much more logical,
and is readable both for arch maintainers *and* for people who have no
clue - and don't care - about which architecture is supposed to support
which interface...

Linus

2007-10-30 20:46:00

by Mathieu Desnoyers

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

* Linus Torvalds ([email protected]) wrote:
>
>
> On Tue, 30 Oct 2007, Mathieu Desnoyers wrote:
> >
> > The key idea for collapsing profiling, markup and tracing was that
> > marking up the code is required for both profiling and tracing. It's
> > only the code that is called from that markup site that differs.
>
> What code is actually shared?
>

The vmstat "counter increments", the blktrace instrumentation and the
profile.c "profile_hits" calls could all be expressed as "generic markup",
and then used for both profiling and tracing. But that would imply creating
a markup management mechanism that permits it without hurting performance.

> Regardless, an internal implementation issue is *not* a good basis for a
> user-visible interface.
>

If we have to put it that way, code markup can be itself seen as a
user-visible interface. The marker name, if a particular analysis
depends on it, will have to keep its name unchanged. The same applies to
the arguments passed to it. Therefore, even though the scheduler code
changed a lot over the past 10 years, its context switch marker could always
be expressed as

	trace_mark(kernel_sched_schedule,
		"prev_pid %d next_pid %d prev_state %ld",
		prev->pid, next->pid, prev->state);

Here kernel_sched_schedule and the format string field names are kept
unchanged. Only the marker's location and the names of the variables it
touches may have to be modified to follow the kernel tree.
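
A minimal userspace sketch of that idea, in C. It is not the kernel's
marker.c: struct marker, marker_call(), record_probe() and
context_switch() are hypothetical names made up for illustration. It only
shows that the name and the format string form the stable part a tool
depends on, while the instrumented function is free to change around them.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a marker; not the kernel implementation. */
typedef void (*probe_fn)(const char *name, const char *fmt, va_list ap);

struct marker {
	const char *name;	/* stable identifier that tools depend on */
	const char *format;	/* stable field names and types */
	probe_fn probe;		/* NULL while no tracer is attached */
};

static struct marker sched_marker = {
	"kernel_sched_schedule",
	"prev_pid %d next_pid %d prev_state %ld",
	NULL,
};

static char last_event[128];

/* Example probe: formats the event into a buffer instead of a trace. */
static void record_probe(const char *name, const char *fmt, va_list ap)
{
	int n = snprintf(last_event, sizeof(last_event), "%s: ", name);

	vsnprintf(last_event + n, sizeof(last_event) - n, fmt, ap);
}

/* Calls the attached probe, if any, with the marker's arguments. */
static void marker_call(struct marker *m, ...)
{
	va_list ap;

	if (!m->probe)
		return;
	va_start(ap, m);
	m->probe(m->name, m->format, ap);
	va_end(ap);
}

/* Stand-in for the instrumented site; its internals may change freely. */
static void context_switch(int prev_pid, int next_pid, long prev_state)
{
	marker_call(&sched_marker, prev_pid, next_pid, prev_state);
}
```

With no probe attached, context_switch() records nothing. After setting
sched_marker.probe = record_probe, each call leaves one formatted event in
last_event.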


> > Ok, so maybe we should keep "markup", "tracing" and "profiling"
> > separately and see how things evolve.
>
> I think so. At least conceptually - ie it might be fine to share a Kconfig
> file, but there probably shouldn't be some forced shared choice about it.
>
> > With SMP systems becoming cheap commodity hardware, each and every
> > developer increasingly face thorny race problems, both in user-space
> > apps and in the kernel, which may involve hypervisor-kernel-userspace
> > interaction.
>
> Well, the thing is, most of the time, those app developers will not be
> doing kernel-level markers. But they may well be doing profiling.
>
> Speaking as an application developer myself (git), I care deeply about
> good profiling info, and I love Oprofile. But even though I'm a kernel
> person too, I'd not want to do kprobes. It's just not relevant to me as a
> user-land developer.
>
> (I might want to extend on strace, but if so, I'd do it generically, not
> as a "probe". For example, I'd love to see the page faults, but I think
> they really *are* "system calls", so I think it would make more sense to
> extend on the ptrace interface than to have any kprobes thing)
>

I am not a kprobe user myself, so I understand you completely. :)
What users expect when they try to fix that kind of issue, when oprofile
and gdb are not sufficient, is to start a data collection mechanism that
will tell them what is going on in their system at large, without requiring
them to write kernel code.

However, that involves marking up key kernel code that will call into a
tracer to extract that information. Other projects have done this in
different ways. SystemTAP, for instance, does it out of tree by keeping
a separate list of addresses where kprobes must be installed. It does the
job from a distribution kernel maintainer's perspective (Red Hat), since
they freeze to a particular kernel version and update this list every time
it breaks, but it will always be a source of frustration for vanilla kernel
users and kernel developers. I think the best way to follow the code flow
is to add markup in the code itself: it would follow the kernel HEAD and
let each subsystem maintainer identify the key instrumentation sites of
their subsystem.

It's important to state that if anyone wants to have their own marker set
in a separate patchset, they can do so. I currently have my own set of
markers to trace the most important kernel sites required to analyze and
show a trace of the Linux kernel in my LTTng kernel tracer. It's derived
from the set found in LTT which did not change much in about 8 years. I
could always submit that for comments to see how subsystem maintainers
will react to the proposed instrumentation.

About extending on ptrace, I am sorry to say that this solution has the
same downsides as kprobes: it is too slow for high performance
applications, especially if turned on system-wide. It will also change
the system behavior so much that it may hide the bugs and performance
issues people are struggling to find. Ptrace is very good at what it
does: looking inside _one_ application and tracing its system calls and
signals, but the approach finds its limits when we are trying to look at
the interactions between multiple applications and the kernel more
globally.


> > > So I actually think that the current Kconfig.instrumentation should be
> > > *removed*. Rather than adding more groupings based on that
> > > fundamentally flawed premise of false commonality.
> >
> > Should it come with a re-duplication of its content into each
> > architecture, which was the case previously ? The oprofile and kprobes
> > menu entries were literally cut and pasted from one architecture to
> > another. Should we put its content in init/Kconfig then ?
>
> I don't think it's a good idea to go back to making it per-architecture,
> although that extensive "depends on <list-of-architectures-here>" might
> indicate that there certainly is room for cleanup there.
>
> And I don't think it's wrong keeping it in kernel/Kconfig.xyz per se, I
> just think it's wrong to (a) lump the code together when it really doesn't
> necessarily need to and (b) show it to users as some kind of choice that
> is tied together (whether it then has common code or not).
>
> On the per-architecture side, I do think it would be better to *not* have
> internal architecture knowledge in a generic file, and as such a line like
>
> depends on X86_32 || IA64 || PPC || S390 || SPARC64 || X86_64 || AVR32
>
> really shouldn't exist in a file like kernel/Kconfig.instrumentation.
>
> It would be much better to do
>
> depends on ARCH_SUPPORTS_KPROBES
>
> in that generic file, and then architectures that do support it would just
> have a
>
> bool ARCH_SUPPORTS_KPROBES
> default y

Absolutely. Let's do it.

>
> in *their* architecture files. That would seem to be much more logical,
> and is readable both for arch maintainers *and* for people who have no
> clue - and don't care - about which architecture is supposed to support
> which interface...
>

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2007-10-30 21:44:40

by Sam Ravnborg

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

>
> Should it come with a re-duplication of its content into each
> architecture, which was the case previously ? The oprofile and kprobes menu
> entries were literally cut and pasted from one architecture to another.
> Should we put its content in init/Kconfig then ?

Stuff it into a new file: arch/Kconfig
We can then extend this file to include all the 'trailing'
Kconfig things that are anyway equal for all ARCHs.

But it should be kept clean - so if we introduce such a file
then we should use ARCH_HAS_whatever in the arch specific Kconfig
files to enable stuff that is not shared.

Sam

2007-10-31 15:52:01

by Frank Ch. Eigler

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

Mathieu Desnoyers <[email protected]> writes:

> [...] SystemTAP, for instance, does it out of tree by keeping a
> separate list of address where kprobes must be installed. It does
> the job on a distribution kernel maintainer perspective (Redhat),
> since they freeze to a particular kernel version and update this
> list every time it breaks, but will always be a source of
> frustration for vanilla kernel users and kernel developers.

This misstates the details. What systemtap has out-of-tree is a list
of kernel function names (and parameter names), not addresses. This
list does change somewhat with kernel versions, but we generally keep
up. We do test with vanilla kernels, and several non-RH distributors
test with their kernels. It is a problem, but it is manageable.


> I think the best way to follow the code flow is to add markup in the
> code itself: it would follow the kernel HEAD and let each subsystem
> maintainer identify the key instrumentation sites of their
> subsystem.

Of course - when and where the dormant overheads are acceptable, and
where the maintainers are willing to commit to a long-term interface
(marker name/arguments). Systemtap can connect to markers as well as
to kprobes and other event sources: mix & match based on what's
available in your particular kernel and what data/computation you
want.


> About extending on ptrace, I am sorry to say that this solution has
> the same downsides as kprobes: it is too slow for high performance
> applications, especially if turned on system-wide. [...]

Roland McGrath's ptrace-replacement (utrace) should help with this.


- FChE

2007-10-31 16:30:04

by Arjan van de Ven

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

On Wed, 31 Oct 2007 11:48:20 -0400
[email protected] (Frank Ch. Eigler) wrote:

> Mathieu Desnoyers <[email protected]> writes:
>
> > [...] SystemTAP, for instance, does it out of tree by keeping a
> > separate list of address where kprobes must be installed. It does
> > the job on a distribution kernel maintainer perspective (Redhat),
> > since they freeze to a particular kernel version and update this
> > list every time it breaks, but will always be a source of
> > frustration for vanilla kernel users and kernel developers.
>
> This misstates the details. What systemtap has out-of-tree is a list
> of kernel function names (and parameter names), not addresses. This
> list does change somewhat with kernel versions, but we generally keep
> up. We do test with vanilla kernels, and several non-RH distributors
> test with their kernels. It is a problem, but it is manageable.

yes so please please submit this stuff for mainline inclusion as has
been asked quite a few times before.



--
If you want to reach me at my work email, use [email protected]
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2007-10-31 16:41:37

by Mathieu Desnoyers

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

* Frank Ch. Eigler ([email protected]) wrote:
> Mathieu Desnoyers <[email protected]> writes:
>
> > [...] SystemTAP, for instance, does it out of tree by keeping a
> > separate list of address where kprobes must be installed. It does
> > the job on a distribution kernel maintainer perspective (Redhat),
> > since they freeze to a particular kernel version and update this
> > list every time it breaks, but will always be a source of
> > frustration for vanilla kernel users and kernel developers.
>
> This misstates the details. What systemtap has out-of-tree is a list
> of kernel function names (and parameter names), not addresses. This
> list does change somewhat with kernel versions, but we generally keep
> up. We do test with vanilla kernels, and several non-RH distributors
> test with their kernels. It is a problem, but it is manageable.
>

That's right, Systemtap uses symbols, thanks for the clarification. But
my point is still valid: SystemTAP expects function names and argument
names to stay unchanged, therefore using the kernel code itself as an
API to userspace tools. The markers act as a buffer between the
important events userspace tools expect and the actual kernel code.

>
> > I think the best way to follow the code flow is to add markup in the
> > code itself: it would follow the kernel HEAD and let each subsystem
> > maintainer identify the key instrumentation sites of their
> > subsystem.
>
> Of course - when and where the dormant overheads are acceptable, and

I have not been able to detect a significant dormant marker overhead
with the immediate values optimization. A load immediate and a predicted
conditional jump are surprisingly cheap.
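
A rough userspace illustration of why a dormant site is cheap, in C. This
is only a sketch under assumptions: marker_enabled, MARK(), probe() and
expensive_arg() are hypothetical names, and the flag is an ordinary
variable, whereas the immediate values optimization patches the value into
the instruction itself, which plain C cannot express. __builtin_expect is
a GCC/Clang extension.

```c
/* Hypothetical names; a plain flag stands in for the patched immediate. */
static int marker_enabled;	/* 0 = dormant, 1 = a probe is attached */
static int probe_calls;		/* how many times the probe ran */
static int args_evaluated;	/* how many times arguments were computed */

static void probe(int value)
{
	(void)value;
	probe_calls++;
}

static int expensive_arg(void)
{
	args_evaluated++;
	return 42;
}

/*
 * The branch is hinted as unlikely, so the dormant path is a load, a test
 * and a not-taken jump. The argument expression sits inside the branch,
 * so it is not evaluated at all while the marker is dormant.
 */
#define MARK(arg_expr)						\
	do {							\
		if (__builtin_expect(marker_enabled, 0))	\
			probe(arg_expr);			\
	} while (0)

/* Stand-in for an instrumented hot path. */
static void hot_path(void)
{
	MARK(expensive_arg());
}
```

Calling hot_path() with marker_enabled == 0 leaves both counters at zero.
Once the flag is set to 1, each call evaluates the argument and runs the
probe once.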

> where the maintainers are willing to commit to a long-term interface
> (marker name/arguments).

Yes. I expect that kind of mark-up to be kept minimalistic.

> Systemtap can connect to markers as well as
> to kprobes and other event sources: mix & match based on what's
> available in your particular kernel and what data/computation you
> want.
>

I think that SystemTAP's flexibility is great, but leads to fragility
wrt kernel code changes. If the "core events" required by SystemTAP
(and also by LTTng by the way) could be turned into markers, I think it
would gain in robustness.

Providing the ability to instrument code locations with breakpoints, in
addition to this, will help users who are unsatisfied with the information
they have, unwilling to recompile their kernel or modules with their
own markers, and ready to accept two limitations:
- performance hit of a breakpoint
- inability to access variables within optimized functions

So yes, both approaches seem to be complementary.

>
> > About extending on ptrace, I am sorry to say that this solution has
> > the same downsides as kprobes: it is too slow for high performance
> > applications, especially if turned on system-wide. [...]
>
> Roland McGrath's ptrace-replacement (utrace) should help with this.
>

Yes, I think he did a good job at it. However, it is not a replacement
for the markers, SystemTAP or LTTng, because it defines a limited
set of hardcoded events (implying yet another type of code markup) that
is by itself a pain to extend. I am not willing to ask a subsystem
maintainer to do more than to "just identify" their important code
paths, the equivalent of adding a printk to their code. I don't think it
is realistic to ask them to create specialized callbacks for each of the
sites they would like to instrument.

So I would say : I'll try to submit a core set of markers patches for
review on LKML and see what people have to say.

Mathieu

>
> - FChE

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2007-10-31 19:09:20

by Frank Ch. Eigler

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

Hi -

On Wed, Oct 31, 2007 at 09:29:07AM -0700, Arjan van de Ven wrote:
> [...]
> > This misstates the details. What systemtap has out-of-tree is a list
> > of kernel function names (and parameter names), not addresses. This
> > list does change somewhat with kernel versions, but we generally keep
> > up. We do test with vanilla kernels, and several non-RH distributors
> > test with their kernels. It is a problem, but it is manageable.
>
> yes so please please submit this stuff for mainline inclusion as has
> been asked quite a few times before.

OK, but I don't recall receiving a clear answer as to how you envision
this would work. Would you support distribution of some systemtap
script files in some new subdirectory?

- FChE

2007-10-31 19:34:27

by Frank Ch. Eigler

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

Hi -

On Wed, Oct 31, 2007 at 12:36:24PM -0400, Mathieu Desnoyers wrote:

> [...] That's right, Systemtap uses symbols, thanks for the
> clarification. But my point is still valid: SystemTAP expects
> function names and argument names to stay unchanged, therefore using
> the kernel code itself as an API to userspace tools. [...]

To be precise, this applies to *kprobes*-based probes only. In
acceptance of this fragility, systemtap includes constructs (aliases,
version-dependent conditionals) to make it reasonably easy to adapt to
different kernel versions.

> I think that SystemTAP's [kprobes-based probes'] flexibility is
> great, but leads to fragility wrt kernel code changes. If the "core
> events" required by SystemTAP (and also by LTTng by the way) could
> be turned into markers, I think it would gain in robustness.

Yes.

> Providing the ability to instrument code locations with breakpoints, in
> addition to this, will help users unsatisfied with the information
> they have, unwilling to recompile their kernel or modules with their
> own markers, ready to accept the two limitations :
> - performance hit of a breakpoint
> - inability to access variables within optimized functions

That latter point has been repeatedly overstated. Markers provide a
fixed set of values. kprobes/dwarf provides access to any statements
and any values (including locals) that a compiler did not altogether
elide. While the latter set is by its nature variable, it will be
much bigger than anything a reasonable set of markers will ever
expose.

> So yes, both approaches seems to be complementary.

Indeed.

> > > About extending on ptrace, I am sorry to say that this solution has
> > > the same downsides as kprobes: it is too slow for high performance
> > > applications, especially if turned on system-wide. [...]
> >
> > Roland McGrath's ptrace-replacement (utrace) should help with this.
>
> Yes, I think he did a good job at it. However, it is not a replacement
> for the markers [...]

Right, not as a whole, but it *could* be an alternative way to hook
into system call type events.

> [...]
> So I would say : I'll try to submit a core set of markers patches for
> review on LKML and see what people have to say.

Thank you. Our team is already in contact to help.

- FChE

2007-10-31 19:50:29

by Arjan van de Ven

Subject: Re: [RFC] Create kinst/ or ki/ directory ?

On Wed, 31 Oct 2007 15:05:06 -0400
"Frank Ch. Eigler" <[email protected]> wrote:

> Hi -
>
> On Wed, Oct 31, 2007 at 09:29:07AM -0700, Arjan van de Ven wrote:
> > [...]
> > > This misstates the details. What systemtap has out-of-tree is a
> > > list of kernel function names (and parameter names), not
> > > addresses. This list does change somewhat with kernel versions,
> > > but we generally keep up. We do test with vanilla kernels, and
> > > several non-RH distributors test with their kernels. It is a
> > > problem, but it is manageable.
> >
> > yes so please please submit this stuff for mainline inclusion as has
> > been asked quite a few times before.
>
> OK, but I don't recall receiving a clear answer as to how you envision
> this would work. Would you support distribution of some systemtap
> script files in some new subdirectory?

yes absolutely.