2009-03-05 23:21:56

by Mathieu Desnoyers

[permalink] [raw]
Subject: [RFC patch 00/41] LTTng 0.105 core for Linux 2.6.27-rc9

Hi,

I spent the last 4-5 months working with the Fujitsu team at implementing the
tracer elements identified as goals at Kernel Summit 2008 and at the following
Plumber Conference. My idea was to incremententally adapt the LTTng tracer,
currently used in the industry and well tested, to those requirements.

I spent the last days rearranging/folding/inspecting the LTTng patchset
to prepare it for an LKML post. The version 0.105 in the LTTng git tree
corresponds to the patchset I am posting here. The said patchset will
only include the core features of LTTng, excluding the timestamping
infrastructure (trace clock) and excluding the instrumentation.

The corresponding git tree contains also the trace clock patches and the lttng
instrumentation. The trace clock is required to use the tracer, but it can be
used without the instrumentation : there is already a kprobes and userspace
event support included in this patchset.

This tracer exports binary data through buffers using splice(). The resulting
binary files can be parsed from userspace because the format string metadata is
exported in the files. The event set can be enhanced by adding tracepoints to
the kernel code and by creating probe modules, which connects callbacks to the
tracepoints and contain the format string metainformation. Those callbacks are
responsible for writing the data in the trace buffers. This separation between
the trace buffer format string and the tracepoints is done on purpose so the
core kernel instrumentation (tracepoints) is not exported to userspace, which
will make maintainance much easier.

The tree including the trace clock patches is available at :

git://git.kernel.org/pub/scm/linux/kernel/git/compudj/linux-2.6-lttng.git
branch : 2.6.29-rc7-lttng-0.105

Project website : http://www.lttng.org/

Information about how to install and use the tracer is available at :

http://ltt.polymtl.ca/svn/trunk/lttv/LTTngManual.html

The size of the LTTng core patchset is 41 patches. The diffstat details
as follow :

include/linux/ltt-core.h | 35
include/linux/ltt-relay.h | 161 +
include/linux/ltt-tracer.h | 43
include/linux/marker.h | 121
kernel/marker.c | 353 ++
kernel/module.c | 31
linux-2.6-lttng/Documentation/markers.txt | 17
linux-2.6-lttng/MAINTAINERS | 7
linux-2.6-lttng/Makefile | 2
linux-2.6-lttng/arch/powerpc/kernel/traps.c | 5
linux-2.6-lttng/arch/powerpc/platforms/cell/spufs/spufs.h | 6
linux-2.6-lttng/arch/sparc/Makefile | 2
linux-2.6-lttng/arch/x86/kernel/dumpstack.c | 5
linux-2.6-lttng/arch/x86/mm/fault.c | 1
linux-2.6-lttng/fs/ext4/fsync.c | 8
linux-2.6-lttng/fs/ext4/ialloc.c | 17
linux-2.6-lttng/fs/ext4/inode.c | 79
linux-2.6-lttng/fs/ext4/mballoc.c | 71
linux-2.6-lttng/fs/ext4/mballoc.h | 2
linux-2.6-lttng/fs/ext4/super.c | 6
linux-2.6-lttng/fs/jbd2/checkpoint.c | 7
linux-2.6-lttng/fs/jbd2/commit.c | 12
linux-2.6-lttng/fs/pipe.c | 5
linux-2.6-lttng/fs/select.c | 41
linux-2.6-lttng/fs/seq_file.c | 45
linux-2.6-lttng/fs/splice.c | 1
linux-2.6-lttng/include/linux/immediate.h | 94
linux-2.6-lttng/include/linux/kvm_host.h | 12
linux-2.6-lttng/include/linux/ltt-channels.h | 94
linux-2.6-lttng/include/linux/ltt-core.h | 47
linux-2.6-lttng/include/linux/ltt-relay.h | 186 +
linux-2.6-lttng/include/linux/ltt-tracer.h | 731 ++++++
linux-2.6-lttng/include/linux/ltt-type-serializer.h | 107
linux-2.6-lttng/include/linux/marker.h | 16
linux-2.6-lttng/include/linux/module.h | 6
linux-2.6-lttng/include/linux/poll.h | 2
linux-2.6-lttng/include/linux/seq_file.h | 20
linux-2.6-lttng/include/trace/ext4.h | 129 +
linux-2.6-lttng/include/trace/jbd2.h | 19
linux-2.6-lttng/init/Kconfig | 2
linux-2.6-lttng/kernel/kallsyms.c | 1
linux-2.6-lttng/kernel/marker.c | 12
linux-2.6-lttng/kernel/module.c | 32
linux-2.6-lttng/ltt/Kconfig | 130 +
linux-2.6-lttng/ltt/Makefile | 15
linux-2.6-lttng/ltt/ltt-channels.c | 338 ++
linux-2.6-lttng/ltt/ltt-core.c | 101
linux-2.6-lttng/ltt/ltt-filter.c | 66
linux-2.6-lttng/ltt/ltt-kprobes.c | 479 +++
linux-2.6-lttng/ltt/ltt-marker-control.c | 265 ++
linux-2.6-lttng/ltt/ltt-relay-alloc.c | 715 +++++
linux-2.6-lttng/ltt/ltt-relay-locked.c | 1704 ++++++++++++++
linux-2.6-lttng/ltt/ltt-serialize.c | 685 +++++
linux-2.6-lttng/ltt/ltt-trace-control.c | 1061 ++++++++
linux-2.6-lttng/ltt/ltt-tracer.c | 1210 +++++++++
linux-2.6-lttng/ltt/ltt-type-serializer.c | 96
linux-2.6-lttng/ltt/ltt-userspace-event.c | 131 +
linux-2.6-lttng/samples/markers/Makefile | 2
linux-2.6-lttng/samples/markers/marker-example.c | 4
linux-2.6-lttng/samples/markers/probe-example.c | 10
linux-2.6-lttng/samples/markers/test-multi.c | 120
linux-2.6-lttng/virt/kvm/kvm_trace.c | 12
ltt/Kconfig | 24
ltt/Makefile | 2
ltt/ltt-relay-alloc.c | 80
65 files changed, 9445 insertions(+), 398 deletions(-)


Comments are welcome.

Mathieu


--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68


2009-03-06 10:12:34

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC patch 00/41] LTTng 0.105 core for Linux 2.6.27-rc9


* Mathieu Desnoyers <[email protected]> wrote:

> Hi,
>
> I spent the last 4-5 months working with the Fujitsu team at
> implementing the tracer elements identified as goals at Kernel
> Summit 2008 and at the following Plumber Conference. My idea
> was to incremententally adapt the LTTng tracer, currently used
> in the industry and well tested, to those requirements.
>
> I spent the last days rearranging/folding/inspecting the LTTng
> patchset to prepare it for an LKML post. The version 0.105 in
> the LTTng git tree corresponds to the patchset I am posting
> here. The said patchset will only include the core features of
> LTTng, excluding the timestamping infrastructure (trace clock)
> and excluding the instrumentation.

I'd like to merge the good bits into the tracing tree. Looking
at the patches you submitted there's a lot of avoidable overlap
with existing tracing features either present upstream already
or queued up for v2.6.30 - and we need to work more on
eliminating that overlap.

I dont think there's much fundamental disagreement just
different implementations - so we should evaluate each of those
details one by one, iteratively.

The first step would be to split the patches up into three
logical buckets:

- Unique features not present in the tracing infracture, in the
event tracer or other tracing plugins - those should be
structured as feature additions.

- Features that you consider superior to existing tracing
features of the kernel. For those, please iterate the
existing code with your enhancements - instead of a parallel
implementation.

- Items which offer nothing new and are not superior to
existing features, those should be dropped probably. This too
is a case by case thing.

Would you be interested in working with us on that? I know that
both Steve and me would be very much interested in that. If you
have time/interest to work on that then we can go through each
patch one by one and categorize them and map out the way to go.

Let me give you a few examples of existing areas of overlap:

> The corresponding git tree contains also the trace clock
> patches and the lttng instrumentation. The trace clock is
> required to use the tracer, but it can be used without the
> instrumentation : there is already a kprobes and userspace
> event support included in this patchset.

The latest tracing tree includes kernel/tracing/trace_clock.c
which offers three trace clock variants, with different
performance/precision tradeoffs:

trace_clock_local() [ for pure CPU-local tracers with no idle
events. This is the fastest but least
coherent tracing clock. ]

trace_clock() [ intermediate, scalable clock with
usable but imprecise global coherency. ]

trace_clock_global() [ globally serialized, coherent clock.
It is the slowest but most accurate variant. ]

Tracing plugins can pick their choice. (This is relatively new
code but you get the idea.)

> This tracer exports binary data through buffers using
> splice(). The resulting binary files can be parsed from
> userspace because the format string metadata is exported in
> the files. The event set can be enhanced by adding tracepoints
> to the kernel code and by creating probe modules, which
> connects callbacks to the tracepoints and contain the format
> string metainformation. Those callbacks are responsible for
> writing the data in the trace buffers. This separation between
> the trace buffer format string and the tracepoints is done on
> purpose so the core kernel instrumentation (tracepoints) is
> not exported to userspace, which will make maintainance much
> easier.

A tracepoint format specification mechanism plus working (and
fast!) zero-copy splice() support of the ring-buffer exists in
the latest tracing tree already - as you are probably aware of
because you commented on those patches a few days ago.

There are 3 good ways to go from here regarding the trace
buffering and splice code:

1- we end up switching to the lttng version in essence
2- we end up keeping the tracing tree version
3- we end up somewhere inbetween

Which point in the above spectrum we will settle down on depends
on the technical details.

Note, whichever path we choose a gradual, iterative workflow is
still needed, so that we improve the existing upstream code with
lttng enhancements gradually.

This approach works for all your other patches as well. A
direct, constructive comparison and active work on unifying them
is required.

Thanks,

Ingo

2009-03-06 18:34:58

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC patch 00/41] LTTng 0.105 core for Linux 2.6.27-rc9


Hi Mathieu,

Thanks for posting this. But it might be better to post in much smaller
chunks. Lets work out the little things first. Posting 41 patches is a bit
overwhelming. Took me a few hours to look at them all, and when I got to
the end, I forgot what was at the beginning.

There's also minor changes to core kernel infrastructure code. seq_file,
exporting functions, and such. These really need to be packaged
separately, and sent to the proper maintainers. Having them in a patch
bomb does not get the proper focus that they need.


On Thu, 5 Mar 2009, Mathieu Desnoyers wrote:

> Hi,
>
> I spent the last 4-5 months working with the Fujitsu team at implementing the
> tracer elements identified as goals at Kernel Summit 2008 and at the following
> Plumber Conference. My idea was to incremententally adapt the LTTng tracer,
> currently used in the industry and well tested, to those requirements.

We really need to work together on this too. The biggest requirement that
came out of that conference was to have a "single unified buffering
system". And this was discussed quite heavily on LKML afterwards. All
development was done incrementally and publicly.

>
> I spent the last days rearranging/folding/inspecting the LTTng patchset
> to prepare it for an LKML post. The version 0.105 in the LTTng git tree
> corresponds to the patchset I am posting here. The said patchset will
> only include the core features of LTTng, excluding the timestamping
> infrastructure (trace clock) and excluding the instrumentation.
>
> The corresponding git tree contains also the trace clock patches and the lttng
> instrumentation. The trace clock is required to use the tracer, but it can be
> used without the instrumentation : there is already a kprobes and userspace
> event support included in this patchset.
>
> This tracer exports binary data through buffers using splice(). The resulting
> binary files can be parsed from userspace because the format string metadata is
> exported in the files. The event set can be enhanced by adding tracepoints to
> the kernel code and by creating probe modules, which connects callbacks to the
> tracepoints and contain the format string metainformation. Those callbacks are
> responsible for writing the data in the trace buffers. This separation between
> the trace buffer format string and the tracepoints is done on purpose so the
> core kernel instrumentation (tracepoints) is not exported to userspace, which
> will make maintainance much easier.

I've discussed this in my previous email. There is still a separation with
the TRACE_EVENT_FORMAT and the maintainers code. The format sting is
"hint" only and may change without notice. LTTng could use it or ignore
it, it is up to the tracer to actually export that string. ftrace chose to
export it because it was a simple way to extract that information. My
utility will need to do a bit more work when the events get more complex,
but the way it is set up, we can do that on a case by case basis.


>
> The tree including the trace clock patches is available at :
>
> git://git.kernel.org/pub/scm/linux/kernel/git/compudj/linux-2.6-lttng.git
> branch : 2.6.29-rc7-lttng-0.105
>
> Project website : http://www.lttng.org/
>
> Information about how to install and use the tracer is available at :
>
> http://ltt.polymtl.ca/svn/trunk/lttv/LTTngManual.html
>
> The size of the LTTng core patchset is 41 patches. The diffstat details
> as follow :


Again, this is overwhelming. This needs to be broken up into a small
subsets that can be examined piece by piece.

>
> include/linux/ltt-core.h | 35
> include/linux/ltt-relay.h | 161 +
> include/linux/ltt-tracer.h | 43
> include/linux/marker.h | 121
> kernel/marker.c | 353 ++
> kernel/module.c | 31
> linux-2.6-lttng/Documentation/markers.txt | 17
> linux-2.6-lttng/MAINTAINERS | 7
> linux-2.6-lttng/Makefile | 2
> linux-2.6-lttng/arch/powerpc/kernel/traps.c | 5
> linux-2.6-lttng/arch/powerpc/platforms/cell/spufs/spufs.h | 6
> linux-2.6-lttng/arch/sparc/Makefile | 2
> linux-2.6-lttng/arch/x86/kernel/dumpstack.c | 5
> linux-2.6-lttng/arch/x86/mm/fault.c | 1
> linux-2.6-lttng/fs/ext4/fsync.c | 8
> linux-2.6-lttng/fs/ext4/ialloc.c | 17
> linux-2.6-lttng/fs/ext4/inode.c | 79
> linux-2.6-lttng/fs/ext4/mballoc.c | 71
> linux-2.6-lttng/fs/ext4/mballoc.h | 2
> linux-2.6-lttng/fs/ext4/super.c | 6
> linux-2.6-lttng/fs/jbd2/checkpoint.c | 7
> linux-2.6-lttng/fs/jbd2/commit.c | 12
> linux-2.6-lttng/fs/pipe.c | 5
> linux-2.6-lttng/fs/select.c | 41
> linux-2.6-lttng/fs/seq_file.c | 45
> linux-2.6-lttng/fs/splice.c | 1

There is a lot of code above that needs to be in their own patch series.
Maintainers do not have the time to pick through 41 patches to find out
which patch might deal with their code.

Thanks,

-- Steve


> linux-2.6-lttng/include/linux/immediate.h | 94
> linux-2.6-lttng/include/linux/kvm_host.h | 12
> linux-2.6-lttng/include/linux/ltt-channels.h | 94
> linux-2.6-lttng/include/linux/ltt-core.h | 47
> linux-2.6-lttng/include/linux/ltt-relay.h | 186 +
> linux-2.6-lttng/include/linux/ltt-tracer.h | 731 ++++++
> linux-2.6-lttng/include/linux/ltt-type-serializer.h | 107
> linux-2.6-lttng/include/linux/marker.h | 16
> linux-2.6-lttng/include/linux/module.h | 6
> linux-2.6-lttng/include/linux/poll.h | 2
> linux-2.6-lttng/include/linux/seq_file.h | 20
> linux-2.6-lttng/include/trace/ext4.h | 129 +
> linux-2.6-lttng/include/trace/jbd2.h | 19
> linux-2.6-lttng/init/Kconfig | 2
> linux-2.6-lttng/kernel/kallsyms.c | 1
> linux-2.6-lttng/kernel/marker.c | 12
> linux-2.6-lttng/kernel/module.c | 32
> linux-2.6-lttng/ltt/Kconfig | 130 +
> linux-2.6-lttng/ltt/Makefile | 15
> linux-2.6-lttng/ltt/ltt-channels.c | 338 ++
> linux-2.6-lttng/ltt/ltt-core.c | 101
> linux-2.6-lttng/ltt/ltt-filter.c | 66
> linux-2.6-lttng/ltt/ltt-kprobes.c | 479 +++
> linux-2.6-lttng/ltt/ltt-marker-control.c | 265 ++
> linux-2.6-lttng/ltt/ltt-relay-alloc.c | 715 +++++
> linux-2.6-lttng/ltt/ltt-relay-locked.c | 1704 ++++++++++++++
> linux-2.6-lttng/ltt/ltt-serialize.c | 685 +++++
> linux-2.6-lttng/ltt/ltt-trace-control.c | 1061 ++++++++
> linux-2.6-lttng/ltt/ltt-tracer.c | 1210 +++++++++
> linux-2.6-lttng/ltt/ltt-type-serializer.c | 96
> linux-2.6-lttng/ltt/ltt-userspace-event.c | 131 +
> linux-2.6-lttng/samples/markers/Makefile | 2
> linux-2.6-lttng/samples/markers/marker-example.c | 4
> linux-2.6-lttng/samples/markers/probe-example.c | 10
> linux-2.6-lttng/samples/markers/test-multi.c | 120
> linux-2.6-lttng/virt/kvm/kvm_trace.c | 12
> ltt/Kconfig | 24
> ltt/Makefile | 2
> ltt/ltt-relay-alloc.c | 80
> 65 files changed, 9445 insertions(+), 398 deletions(-)
>
>
> Comments are welcome.
>
> Mathieu
>
>
> --
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
>

2009-03-06 19:02:22

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [RFC patch 00/41] LTTng 0.105 core for Linux 2.6.27-rc9

On Fri, Mar 06, 2009 at 01:34:43PM -0500, Steven Rostedt wrote:
>
> Hi Mathieu,
>
> Thanks for posting this. But it might be better to post in much smaller
> chunks. Lets work out the little things first. Posting 41 patches is a bit
> overwhelming. Took me a few hours to look at them all, and when I got to
> the end, I forgot what was at the beginning.
>
> There's also minor changes to core kernel infrastructure code. seq_file,
> exporting functions, and such. These really need to be packaged
> separately, and sent to the proper maintainers. Having them in a patch
> bomb does not get the proper focus that they need.
>


Yes, I must confess I tried to review some of them but I have been discouraged
by the high volume and the multiple subjects that come with.

Iterating with smaller topics at a time, more focused subjects would help us to bring
the attention it deserves.

Frederic.


> On Thu, 5 Mar 2009, Mathieu Desnoyers wrote:
>
> > Hi,
> >
> > I spent the last 4-5 months working with the Fujitsu team at implementing the
> > tracer elements identified as goals at Kernel Summit 2008 and at the following
> > Plumber Conference. My idea was to incremententally adapt the LTTng tracer,
> > currently used in the industry and well tested, to those requirements.
>
> We really need to work together on this too. The biggest requirement that
> came out of that conference was to have a "single unified buffering
> system". And this was discussed quite heavily on LKML afterwards. All
> development was done incrementally and publicly.
> >
> > I spent the last days rearranging/folding/inspecting the LTTng patchset
> > to prepare it for an LKML post. The version 0.105 in the LTTng git tree
> > corresponds to the patchset I am posting here. The said patchset will
> > only include the core features of LTTng, excluding the timestamping
> > infrastructure (trace clock) and excluding the instrumentation.
> >
> > The corresponding git tree contains also the trace clock patches and the lttng
> > instrumentation. The trace clock is required to use the tracer, but it can be
> > used without the instrumentation : there is already a kprobes and userspace
> > event support included in this patchset.
> >
> > This tracer exports binary data through buffers using splice(). The resulting
> > binary files can be parsed from userspace because the format string metadata is
> > exported in the files. The event set can be enhanced by adding tracepoints to
> > the kernel code and by creating probe modules, which connects callbacks to the
> > tracepoints and contain the format string metainformation. Those callbacks are
> > responsible for writing the data in the trace buffers. This separation between
> > the trace buffer format string and the tracepoints is done on purpose so the
> > core kernel instrumentation (tracepoints) is not exported to userspace, which
> > will make maintainance much easier.
>
> I've discussed this in my previous email. There is still a separation with
> the TRACE_EVENT_FORMAT and the maintainers code. The format sting is
> "hint" only and may change without notice. LTTng could use it or ignore
> it, it is up to the tracer to actually export that string. ftrace chose to
> export it because it was a simple way to extract that information. My
> utility will need to do a bit more work when the events get more complex,
> but the way it is set up, we can do that on a case by case basis.
>
>
> >
> > The tree including the trace clock patches is available at :
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/compudj/linux-2.6-lttng.git
> > branch : 2.6.29-rc7-lttng-0.105
> >
> > Project website : http://www.lttng.org/
> >
> > Information about how to install and use the tracer is available at :
> >
> > http://ltt.polymtl.ca/svn/trunk/lttv/LTTngManual.html
> >
> > The size of the LTTng core patchset is 41 patches. The diffstat details
> > as follow :
>
>
> Again, this is overwhelming. This needs to be broken up into a small
> subsets that can be examined piece by piece.
>
> >
> > include/linux/ltt-core.h | 35
> > include/linux/ltt-relay.h | 161 +
> > include/linux/ltt-tracer.h | 43
> > include/linux/marker.h | 121
> > kernel/marker.c | 353 ++
> > kernel/module.c | 31
> > linux-2.6-lttng/Documentation/markers.txt | 17
> > linux-2.6-lttng/MAINTAINERS | 7
> > linux-2.6-lttng/Makefile | 2
> > linux-2.6-lttng/arch/powerpc/kernel/traps.c | 5
> > linux-2.6-lttng/arch/powerpc/platforms/cell/spufs/spufs.h | 6
> > linux-2.6-lttng/arch/sparc/Makefile | 2
> > linux-2.6-lttng/arch/x86/kernel/dumpstack.c | 5
> > linux-2.6-lttng/arch/x86/mm/fault.c | 1
> > linux-2.6-lttng/fs/ext4/fsync.c | 8
> > linux-2.6-lttng/fs/ext4/ialloc.c | 17
> > linux-2.6-lttng/fs/ext4/inode.c | 79
> > linux-2.6-lttng/fs/ext4/mballoc.c | 71
> > linux-2.6-lttng/fs/ext4/mballoc.h | 2
> > linux-2.6-lttng/fs/ext4/super.c | 6
> > linux-2.6-lttng/fs/jbd2/checkpoint.c | 7
> > linux-2.6-lttng/fs/jbd2/commit.c | 12
> > linux-2.6-lttng/fs/pipe.c | 5
> > linux-2.6-lttng/fs/select.c | 41
> > linux-2.6-lttng/fs/seq_file.c | 45
> > linux-2.6-lttng/fs/splice.c | 1
>
> There is a lot of code above that needs to be in their own patch series.
> Maintainers do not have the time to pick through 41 patches to find out
> which patch might deal with their code.
>
> Thanks,
>
> -- Steve
>
>
> > linux-2.6-lttng/include/linux/immediate.h | 94
> > linux-2.6-lttng/include/linux/kvm_host.h | 12
> > linux-2.6-lttng/include/linux/ltt-channels.h | 94
> > linux-2.6-lttng/include/linux/ltt-core.h | 47
> > linux-2.6-lttng/include/linux/ltt-relay.h | 186 +
> > linux-2.6-lttng/include/linux/ltt-tracer.h | 731 ++++++
> > linux-2.6-lttng/include/linux/ltt-type-serializer.h | 107
> > linux-2.6-lttng/include/linux/marker.h | 16
> > linux-2.6-lttng/include/linux/module.h | 6
> > linux-2.6-lttng/include/linux/poll.h | 2
> > linux-2.6-lttng/include/linux/seq_file.h | 20
> > linux-2.6-lttng/include/trace/ext4.h | 129 +
> > linux-2.6-lttng/include/trace/jbd2.h | 19
> > linux-2.6-lttng/init/Kconfig | 2
> > linux-2.6-lttng/kernel/kallsyms.c | 1
> > linux-2.6-lttng/kernel/marker.c | 12
> > linux-2.6-lttng/kernel/module.c | 32
> > linux-2.6-lttng/ltt/Kconfig | 130 +
> > linux-2.6-lttng/ltt/Makefile | 15
> > linux-2.6-lttng/ltt/ltt-channels.c | 338 ++
> > linux-2.6-lttng/ltt/ltt-core.c | 101
> > linux-2.6-lttng/ltt/ltt-filter.c | 66
> > linux-2.6-lttng/ltt/ltt-kprobes.c | 479 +++
> > linux-2.6-lttng/ltt/ltt-marker-control.c | 265 ++
> > linux-2.6-lttng/ltt/ltt-relay-alloc.c | 715 +++++
> > linux-2.6-lttng/ltt/ltt-relay-locked.c | 1704 ++++++++++++++
> > linux-2.6-lttng/ltt/ltt-serialize.c | 685 +++++
> > linux-2.6-lttng/ltt/ltt-trace-control.c | 1061 ++++++++
> > linux-2.6-lttng/ltt/ltt-tracer.c | 1210 +++++++++
> > linux-2.6-lttng/ltt/ltt-type-serializer.c | 96
> > linux-2.6-lttng/ltt/ltt-userspace-event.c | 131 +
> > linux-2.6-lttng/samples/markers/Makefile | 2
> > linux-2.6-lttng/samples/markers/marker-example.c | 4
> > linux-2.6-lttng/samples/markers/probe-example.c | 10
> > linux-2.6-lttng/samples/markers/test-multi.c | 120
> > linux-2.6-lttng/virt/kvm/kvm_trace.c | 12
> > ltt/Kconfig | 24
> > ltt/Makefile | 2
> > ltt/ltt-relay-alloc.c | 80
> > 65 files changed, 9445 insertions(+), 398 deletions(-)
> >
> >
> > Comments are welcome.
> >
> > Mathieu
> >
> >
> > --
> > Mathieu Desnoyers
> > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
> >

2009-03-06 19:02:43

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [RFC patch 00/41] LTTng 0.105 core for Linux 2.6.27-rc9

* Ingo Molnar ([email protected]) wrote:
>
> * Mathieu Desnoyers <[email protected]> wrote:
>
> > Hi,
> >
> > I spent the last 4-5 months working with the Fujitsu team at
> > implementing the tracer elements identified as goals at Kernel
> > Summit 2008 and at the following Plumber Conference. My idea
> > was to incremententally adapt the LTTng tracer, currently used
> > in the industry and well tested, to those requirements.
> >
> > I spent the last days rearranging/folding/inspecting the LTTng
> > patchset to prepare it for an LKML post. The version 0.105 in
> > the LTTng git tree corresponds to the patchset I am posting
> > here. The said patchset will only include the core features of
> > LTTng, excluding the timestamping infrastructure (trace clock)
> > and excluding the instrumentation.
>
> I'd like to merge the good bits into the tracing tree. Looking
> at the patches you submitted there's a lot of avoidable overlap
> with existing tracing features either present upstream already
> or queued up for v2.6.30 - and we need to work more on
> eliminating that overlap.
>
> I dont think there's much fundamental disagreement just
> different implementations - so we should evaluate each of those
> details one by one, iteratively.
>
> The first step would be to split the patches up into three
> logical buckets:
>
> - Unique features not present in the tracing infracture, in the
> event tracer or other tracing plugins - those should be
> structured as feature additions.
>
> - Features that you consider superior to existing tracing
> features of the kernel. For those, please iterate the
> existing code with your enhancements - instead of a parallel
> implementation.
>
> - Items which offer nothing new and are not superior to
> existing features, those should be dropped probably. This too
> is a case by case thing.
>
> Would you be interested in working with us on that? I know that
> both Steve and me would be very much interested in that. If you
> have time/interest to work on that then we can go through each
> patch one by one and categorize them and map out the way to go.
>

Hi Ingo,

Yes, I think an incremental inclusion is the way to go in the current
context. If we do it correctly, the resulting discussion will end up
putting the best features of both tracers in the resulting one. The only
problem is that I have less time at the moment (I should be writing my
Ph.D. thesis full time), but I think it's very important at this stage
to interact with the kernel community so everyone can benefit of the
work done in the past years.

I guess that identifying the good parts in each tracer will be a first
step towards integration. If you want, I could start by replying to my
own patchset post and do a ftrace-lttng comparison on each important
item.

I don't know how much time I'll be able to put into refactoring all this
though : I just spent 4 years developing LTTng and making sure every
nuts and bolts fits together fine. Hopefully we'll be able to keep the
modifications as lightweight and as iterative as possible.

> Let me give you a few examples of existing areas of overlap:
>
> > The corresponding git tree contains also the trace clock
> > patches and the lttng instrumentation. The trace clock is
> > required to use the tracer, but it can be used without the
> > instrumentation : there is already a kprobes and userspace
> > event support included in this patchset.
>
> The latest tracing tree includes kernel/tracing/trace_clock.c
> which offers three trace clock variants, with different
> performance/precision tradeoffs:
>
> trace_clock_local() [ for pure CPU-local tracers with no idle
> events. This is the fastest but least
> coherent tracing clock. ]
>
> trace_clock() [ intermediate, scalable clock with
> usable but imprecise global coherency. ]
>
> trace_clock_global() [ globally serialized, coherent clock.
> It is the slowest but most accurate variant. ]
>
> Tracing plugins can pick their choice. (This is relatively new
> code but you get the idea.)
>

Hehe this reminds me of the trace clock thread I started a few months
ago on LKML. So you guys took over that work ? Nice :) Is it based on
the trace-clock patches I proposed back then ? Ah, no. Well I guess
we'll have to discuss this too. I agree on the
trace_clock_local/trace_clock/trace_clock_global interface, it looks
nice. The underlying implementation will have to be discussed though.


> > This tracer exports binary data through buffers using
> > splice(). The resulting binary files can be parsed from
> > userspace because the format string metadata is exported in
> > the files. The event set can be enhanced by adding tracepoints
> > to the kernel code and by creating probe modules, which
> > connects callbacks to the tracepoints and contain the format
> > string metainformation. Those callbacks are responsible for
> > writing the data in the trace buffers. This separation between
> > the trace buffer format string and the tracepoints is done on
> > purpose so the core kernel instrumentation (tracepoints) is
> > not exported to userspace, which will make maintainance much
> > easier.
>
> A tracepoint format specification mechanism plus working (and
> fast!) zero-copy splice() support of the ring-buffer exists in
> the latest tracing tree already - as you are probably aware of
> because you commented on those patches a few days ago.

Yep, I know. :)

>
> There are 3 good ways to go from here regarding the trace
> buffering and splice code:
>
> 1- we end up switching to the lttng version in essence
> 2- we end up keeping the tracing tree version
> 3- we end up somewhere inbetween
>
> Which point in the above spectrum we will settle down on depends
> on the technical details.
>
> Note, whichever path we choose a gradual, iterative workflow is
> still needed, so that we improve the existing upstream code with
> lttng enhancements gradually.
>
> This approach works for all your other patches as well. A
> direct, constructive comparison and active work on unifying them
> is required.
>

Yes, let's try to do it. Maybe it's better to start a new thread with
less CCs for this type of work ?

Mathieu

> Thanks,
>
> Ingo

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-03-11 18:34:20

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC patch 00/41] LTTng 0.105 core for Linux 2.6.27-rc9


* Mathieu Desnoyers <[email protected]> wrote:

> > Let me give you a few examples of existing areas of overlap:
> >
> > > The corresponding git tree contains also the trace clock
> > > patches and the lttng instrumentation. The trace clock is
> > > required to use the tracer, but it can be used without the
> > > instrumentation : there is already a kprobes and userspace
> > > event support included in this patchset.
> >
> > The latest tracing tree includes
> > kernel/tracing/trace_clock.c which offers three trace clock
> > variants, with different performance/precision tradeoffs:
> >
> > trace_clock_local() [ for pure CPU-local tracers with no idle
> > events. This is the fastest but least
> > coherent tracing clock. ]
> >
> > trace_clock() [ intermediate, scalable clock with
> > usable but imprecise global coherency. ]
> >
> > trace_clock_global() [ globally serialized, coherent clock.
> > It is the slowest but most accurate variant. ]
> >
> > Tracing plugins can pick their choice. (This is relatively new
> > code but you get the idea.)
> >
>
> Hehe this reminds me of the trace clock thread I started a few
> months ago on LKML. So you guys took over that work ? Nice :)
> Is it based on the trace-clock patches I proposed back then ?
> Ah, no. Well I guess we'll have to discuss this too. I agree
> on the trace_clock_local/trace_clock/trace_clock_global
> interface, it looks nice. The underlying implementation will
> have to be discussed though.

Beware: i found the assembly trace_clock() stuff you did back
then rather ugly ;-) I dont think there's any easy solutions
here, so i went for this palette of clocks.

> > This approach works for all your other patches as well. A
> > direct, constructive comparison and active work on unifying
> > them is required.
>
> Yes, let's try to do it. Maybe it's better to start a new
> thread with less CCs for this type of work ?

Yeah. More finegrained steps are really needed.

The least controversial bits would be the many tracepoints you
identified in LTTng as interesting. Mind sending them separately
so that we can make some progress?

In the latest tracing code all tracepoints will show up
automatically under /debug/tracing/events/ and can be used by
user-space tools.

Ingo

2009-03-13 16:19:15

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [RFC patch 00/41] LTTng 0.105 core for Linux 2.6.27-rc9

* Ingo Molnar ([email protected]) wrote:
>
> * Mathieu Desnoyers <[email protected]> wrote:
>
> > > Let me give you a few examples of existing areas of overlap:
> > >
> > > > The corresponding git tree contains also the trace clock
> > > > patches and the lttng instrumentation. The trace clock is
> > > > required to use the tracer, but it can be used without the
> > > > instrumentation : there is already a kprobes and userspace
> > > > event support included in this patchset.
> > >
> > > The latest tracing tree includes
> > > kernel/tracing/trace_clock.c which offers three trace clock
> > > variants, with different performance/precision tradeoffs:
> > >
> > > trace_clock_local() [ for pure CPU-local tracers with no idle
> > > events. This is the fastest but least
> > > coherent tracing clock. ]
> > >
> > > trace_clock() [ intermediate, scalable clock with
> > > usable but imprecise global coherency. ]
> > >
> > > trace_clock_global() [ globally serialized, coherent clock.
> > > It is the slowest but most accurate variant. ]
> > >
> > > Tracing plugins can pick their choice. (This is relatively new
> > > code but you get the idea.)
> > >
> >
> > Hehe this reminds me of the trace clock thread I started a few
> > months ago on LKML. So you guys took over that work ? Nice :)
> > Is it based on the trace-clock patches I proposed back then ?
> > Ah, no. Well I guess we'll have to discuss this too. I agree
> > on the trace_clock_local/trace_clock/trace_clock_global
> > interface, it looks nice. The underlying implementation will
> > have to be discussed though.
>
> Beware: i found the assembly trace_clock() stuff you did back
> then rather ugly ;-) I dont think there's any easy solutions
> here, so i went for this palette of clocks.
>

Hi Ingo,

I agree for the palette of clocks to fit all needs. I wonder what
exactly you found ugly in the approach I took with my trace_clock()
implementation ? Maybe you could refresh my memory, I do not recall
writing any part of it in assembly.. ? But this is a whole different
topic. We can discuss this later.


> > > This approach works for all your other patches as well. A
> > > direct, constructive comparison and active work on unifying
> > > them is required.
> >
> > Yes, let's try to do it. Maybe it's better to start a new
> > thread with less CCs for this type of work ?
>
> Yeah. More finegrained steps are really needed.
>
> The least controversial bits would be the many tracepoints you
> identified in LTTng as interesting. Mind sending them separately
> so that we can make some progress?
>

OK, I'll work on it. Note however that I flipped my patchset around in
the past months : thinking that the tracer acceptance would be easier
than tracepoints. And now we are back at square one. Is it just me or I
have the funny feeling of acting like a dog running in circles after his
tail ? :)


> In the latest tracing code all tracepoints will show up
> automatically under /debug/tracing/events/ and can be used by
> user-space tools.
>

Hrm, the thing is : I strongly disagree with showing tracepoints to
userspace and with the fact that you embed the data serialization
"pseudo-callbacks" into the tracepoint headers. Here is why. Peter
Zijlstra convinced me that putting format strings directly in tracepoint
headers was a bad idea. First off, you end up requiring all tracers
which connect on the tracepoints to choose your event format description
if they ever want to benefit from it. It's a "all included" formula :
either the tracers use them or they cannot output "standard" trace
information.

Second point : the tracepoints are meant to be tied to the kernel
source. Putting those event descriptions in global headers seems like
the people responsible for writing the kernel code surrounding the
tracepoints will end up being responsible for updating those tracepoint
event format descriptions. I think this is an unacceptable maintainance
burden for the whole community. Only tracer-specific modules should
refuse to build whenever it does not match the inner kernel structures
anymore.

Third point : it's plainly ugly. If we look at your tracepoint example :


/*
* Tracepoint for task switches, performed by the scheduler:
*
* (NOTE: the 'rq' argument is not used by generic trace events,
* but used by the latency tracer plugin. )
*/
TRACE_EVENT(sched_switch,

TP_PROTO(struct rq *rq, struct task_struct *prev,
struct task_struct *next),

TP_ARGS(rq, prev, next),

TP_STRUCT__entry(
__array( char, prev_comm, TASK_COMM_LEN )
__field( pid_t, prev_pid )
__field( int, prev_prio )
__array( char, next_comm, TASK_COMM_LEN )
__field( pid_t, next_pid )
__field( int, next_prio )
),

TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
__entry->next_comm, __entry->next_pid, __entry->next_prio),

TP_fast_assign(
memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
__entry->prev_pid = prev->pid;
__entry->prev_prio = prev->prio;
memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
__entry->next_pid = next->pid;
__entry->next_prio = next->prio;
)
);


I notice that you actually embed the "function" that converts between
the format string into a header macro declaration. Why don't we write
this in plain C ?

in include/trace/sched.h :

DECLARE_TRACE(sched_switch,
TPPROTO(struct rq *rq, struct task_struct *prev,
struct task_struct *next),
TPARGS(rq, prev, next));

in ltt/probes/kernel-trace.c :


void probe_sched_switch(struct rq *rq, struct task_struct *prev,
struct task_struct *next);

DEFINE_MARKER_TP(kernel, sched_schedule, sched_switch, probe_sched_switch,
"prev_pid %d next_pid %d prev_state #2d%ld");

notrace void probe_sched_switch(struct rq *rq, struct task_struct *prev,
struct task_struct *next)
{
struct marker *marker;
struct serialize_int_int_short data;

data.f1 = prev->pid;
data.f2 = next->pid;
data.f3 = prev->state;

marker = &GET_MARKER(kernel, sched_schedule);
ltt_specialized_trace(marker, marker->single.probe_private,
&data, serialize_sizeof(data), sizeof(int));
}

This way, if the content of task_struct ever changes, only the tracer
module will break, not code touching a global header.

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-03-14 16:45:36

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC patch 00/41] LTTng 0.105 core for Linux 2.6.27-rc9


* Mathieu Desnoyers <[email protected]> wrote:

> * Ingo Molnar ([email protected]) wrote:
> >
> > * Mathieu Desnoyers <[email protected]> wrote:
> >
> > > > Let me give you a few examples of existing areas of overlap:
> > > >
> > > > > The corresponding git tree contains also the trace clock
> > > > > patches and the lttng instrumentation. The trace clock is
> > > > > required to use the tracer, but it can be used without the
> > > > > instrumentation : there is already a kprobes and userspace
> > > > > event support included in this patchset.
> > > >
> > > > The latest tracing tree includes
> > > > kernel/tracing/trace_clock.c which offers three trace clock
> > > > variants, with different performance/precision tradeoffs:
> > > >
> > > > trace_clock_local() [ for pure CPU-local tracers with no idle
> > > > events. This is the fastest but least
> > > > coherent tracing clock. ]
> > > >
> > > > trace_clock() [ intermediate, scalable clock with
> > > > usable but imprecise global coherency. ]
> > > >
> > > > trace_clock_global() [ globally serialized, coherent clock.
> > > > It is the slowest but most accurate variant. ]
> > > >
> > > > Tracing plugins can pick their choice. (This is relatively new
> > > > code but you get the idea.)
> > > >
> > >
> > > Hehe this reminds me of the trace clock thread I started a few
> > > months ago on LKML. So you guys took over that work ? Nice :)
> > > Is it based on the trace-clock patches I proposed back then ?
> > > Ah, no. Well I guess we'll have to discuss this too. I agree
> > > on the trace_clock_local/trace_clock/trace_clock_global
> > > interface, it looks nice. The underlying implementation will
> > > have to be discussed though.
> >
> > Beware: i found the assembly trace_clock() stuff you did back
> > then rather ugly ;-) I dont think there's any easy solutions
> > here, so i went for this palette of clocks.
> >
>
> Hi Ingo,
>
> I agree for the palette of clocks to fit all needs. I wonder
> what exactly you found ugly in the approach I took with my
> trace_clock() implementation ? Maybe you could refresh my
> memory, I do not recall writing any part of it in assembly.. ?
> But this is a whole different topic. We can discuss this
> later.

hm, it was months ago. Ok, it must have been this one:

http://lkml.org/lkml/2008/11/7/21
http://lkml.org/lkml/2008/11/7/23

indeed no assembly but almost ;-) What i found rather ugly were
the cnt32_to_63() complications.

Ingo

2009-03-14 16:59:51

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [ltt-dev] [RFC patch 00/41] LTTng 0.105 core for Linux 2.6.27-rc9

* Ingo Molnar ([email protected]) wrote:
>
> * Mathieu Desnoyers <[email protected]> wrote:
>
> > * Ingo Molnar ([email protected]) wrote:
> > >
> > > * Mathieu Desnoyers <[email protected]> wrote:
> > >
> > > > > Let me give you a few examples of existing areas of overlap:
> > > > >
> > > > > > The corresponding git tree contains also the trace clock
> > > > > > patches and the lttng instrumentation. The trace clock is
> > > > > > required to use the tracer, but it can be used without the
> > > > > > instrumentation : there is already a kprobes and userspace
> > > > > > event support included in this patchset.
> > > > >
> > > > > The latest tracing tree includes
> > > > > kernel/tracing/trace_clock.c which offers three trace clock
> > > > > variants, with different performance/precision tradeoffs:
> > > > >
> > > > > trace_clock_local() [ for pure CPU-local tracers with no idle
> > > > > events. This is the fastest but least
> > > > > coherent tracing clock. ]
> > > > >
> > > > > trace_clock() [ intermediate, scalable clock with
> > > > > usable but imprecise global coherency. ]
> > > > >
> > > > > trace_clock_global() [ globally serialized, coherent clock.
> > > > > It is the slowest but most accurate variant. ]
> > > > >
> > > > > Tracing plugins can pick their choice. (This is relatively new
> > > > > code but you get the idea.)
> > > > >
> > > >
> > > > Hehe this reminds me of the trace clock thread I started a few
> > > > months ago on LKML. So you guys took over that work ? Nice :)
> > > > Is it based on the trace-clock patches I proposed back then ?
> > > > Ah, no. Well I guess we'll have to discuss this too. I agree
> > > > on the trace_clock_local/trace_clock/trace_clock_global
> > > > interface, it looks nice. The underlying implementation will
> > > > have to be discussed though.
> > >
> > > Beware: i found the assembly trace_clock() stuff you did back
> > > then rather ugly ;-) I dont think there's any easy solutions
> > > here, so i went for this palette of clocks.
> > >
> >
> > Hi Ingo,
> >
> > I agree for the palette of clocks to fit all needs. I wonder
> > what exactly you found ugly in the approach I took with my
> > trace_clock() implementation ? Maybe you could refresh my
> > memory, I do not recall writing any part of it in assembly.. ?
> > But this is a whole different topic. We can discuss this
> > later.
>
> hm, it was months ago. Ok, it must have been this one:
>
> http://lkml.org/lkml/2008/11/7/21
> http://lkml.org/lkml/2008/11/7/23
>
> indeed no assembly but almost ;-) What i found rather ugly were
> the cnt32_to_63() complications.
>

The fact that I put a patch touching cnt32_to_63 back then was just a
way to point out how the current cnt32_to_63 implementation is broken
for SMP and should stay in UP-only architecture-specific code (that was
an answer to Peter Zijlstra's reuse concerns). Once I got agreement that
tracers should not be expected to use cnt32_to_63, I dropped any patch
touching this piece of infrastructure and stayed with my
trace-clock-32-to-64.c implementation, which is SMP-safe, scalable and
basically extends atomically (through a rcu-like algorithm) a N bit
clock to a full 64-bits clock. This is very, very useful for lots of
architectures. Is it that code you find ugly ?

Mathieu

> Ingo
>
> _______________________________________________
> ltt-dev mailing list
> [email protected]
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68