2009-09-02 12:19:10

by Xiao Guangrong

[permalink] [raw]
Subject: [tip:timers/tracing] hrtimer: Add tracepoint for hrtimers

Commit-ID: c6a2a1770245f654f35f60e1458d4356680f9519
Gitweb: http://git.kernel.org/tip/c6a2a1770245f654f35f60e1458d4356680f9519
Author: Xiao Guangrong <[email protected]>
AuthorDate: Mon, 10 Aug 2009 10:51:23 +0800
Committer: Thomas Gleixner <[email protected]>
CommitDate: Sat, 29 Aug 2009 14:10:06 +0200

hrtimer: Add tracepoint for hrtimers

Add tracepoints which cover the life cycle of a hrtimer. The
tracepoints are integrated with the already existing debug_object
debug points as far as possible.

[ tglx: Fixed comments, made output conistent, easier to read and
parse. Fixed output for 32bit archs which do not use the
scalar representation of ktime_t. Hand current time to
trace_hrtimer_expiry_entry instead of calling get_time()
inside of the trace assignment. ]

Signed-off-by: Xiao Guangrong <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Anton Blanchard <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Zhaolei <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>


---
include/trace/events/timer.h | 139 ++++++++++++++++++++++++++++++++++++++++++
kernel/hrtimer.c | 40 ++++++++++---
2 files changed, 171 insertions(+), 8 deletions(-)

diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
index 725892a..df3c07f 100644
--- a/include/trace/events/timer.h
+++ b/include/trace/events/timer.h
@@ -5,6 +5,7 @@
#define _TRACE_TIMER_H

#include <linux/tracepoint.h>
+#include <linux/hrtimer.h>
#include <linux/timer.h>

/**
@@ -131,6 +132,144 @@ TRACE_EVENT(timer_cancel,
TP_printk("timer %p", __entry->timer)
);

+/**
+ * hrtimer_init - called when the hrtimer is initialized
+ * @timer: pointer to struct hrtimer
+ * @clockid: the hrtimers clock
+ * @mode: the hrtimers mode
+ */
+TRACE_EVENT(hrtimer_init,
+
+ TP_PROTO(struct hrtimer *timer, clockid_t clockid,
+ enum hrtimer_mode mode),
+
+ TP_ARGS(timer, clockid, mode),
+
+ TP_STRUCT__entry(
+ __field( void *, timer )
+ __field( clockid_t, clockid )
+ __field( enum hrtimer_mode, mode )
+ ),
+
+ TP_fast_assign(
+ __entry->timer = timer;
+ __entry->clockid = clockid;
+ __entry->mode = mode;
+ ),
+
+ TP_printk("hrtimer %p, clockid %s, mode %s", __entry->timer,
+ __entry->clockid == CLOCK_REALTIME ?
+ "CLOCK_REALTIME" : "CLOCK_MONOTONIC",
+ __entry->mode == HRTIMER_MODE_ABS ?
+ "HRTIMER_MODE_ABS" : "HRTIMER_MODE_REL")
+);
+
+/**
+ * hrtimer_start - called when the hrtimer is started
+ * @timer: pointer to struct hrtimer
+ */
+TRACE_EVENT(hrtimer_start,
+
+ TP_PROTO(struct hrtimer *timer),
+
+ TP_ARGS(timer),
+
+ TP_STRUCT__entry(
+ __field( void *, timer )
+ __field( void *, function )
+ __field( s64, expires )
+ __field( s64, softexpires )
+ ),
+
+ TP_fast_assign(
+ __entry->timer = timer;
+ __entry->function = timer->function;
+ __entry->expires = hrtimer_get_expires(timer).tv64;
+ __entry->softexpires = hrtimer_get_softexpires(timer).tv64;
+ ),
+
+ TP_printk("hrtimer %p, func %pf, expires %llu, softexpires %llu",
+ __entry->timer, __entry->function,
+ (unsigned long long)ktime_to_ns((ktime_t) {
+ .tv64 = __entry->expires }),
+ (unsigned long long)ktime_to_ns((ktime_t) {
+ .tv64 = __entry->softexpires }))
+);
+
+/**
+ * htimmer_expire_entry - called immediately before the hrtimer callback
+ * @timer: pointer to struct hrtimer
+ * @now: pointer to variable which contains current time of the
+ * timers base.
+ *
+ * Allows to determine the timer latency.
+ */
+TRACE_EVENT(hrtimer_expire_entry,
+
+ TP_PROTO(struct hrtimer *timer, ktime_t *now),
+
+ TP_ARGS(timer, now),
+
+ TP_STRUCT__entry(
+ __field( void *, timer )
+ __field( s64, now )
+ ),
+
+ TP_fast_assign(
+ __entry->timer = timer;
+ __entry->now = now->tv64;
+ ),
+
+ TP_printk("hrtimer %p, now %llu", __entry->timer,
+ (unsigned long long)ktime_to_ns((ktime_t) {
+ .tv64 = __entry->now }))
+ );
+
+/**
+ * hrtimer_expire_exit - called immediately after the hrtimer callback returns
+ * @timer: pointer to struct hrtimer
+ *
+ * When used in combination with the hrtimer_expire_entry tracepoint we can
+ * determine the runtime of the callback function.
+ */
+TRACE_EVENT(hrtimer_expire_exit,
+
+ TP_PROTO(struct hrtimer *timer),
+
+ TP_ARGS(timer),
+
+ TP_STRUCT__entry(
+ __field( void *, timer )
+ ),
+
+ TP_fast_assign(
+ __entry->timer = timer;
+ ),
+
+ TP_printk("hrtimer %p", __entry->timer)
+);
+
+/**
+ * hrtimer_cancel - called when the hrtimer is canceled
+ * @timer: pointer to struct hrtimer
+ */
+TRACE_EVENT(hrtimer_cancel,
+
+ TP_PROTO(struct hrtimer *timer),
+
+ TP_ARGS(timer),
+
+ TP_STRUCT__entry(
+ __field( void *, timer )
+ ),
+
+ TP_fast_assign(
+ __entry->timer = timer;
+ ),
+
+ TP_printk("hrtimer %p", __entry->timer)
+);
+
#endif /* _TRACE_TIMER_H */

/* This part must be outside protection */
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index e2f91ec..b44d1b0 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -48,6 +48,8 @@

#include <asm/uaccess.h>

+#include <trace/events/timer.h>
+
/*
* The timer bases:
*
@@ -441,6 +443,26 @@ static inline void debug_hrtimer_activate(struct hrtimer *timer) { }
static inline void debug_hrtimer_deactivate(struct hrtimer *timer) { }
#endif

+static inline void
+debug_init(struct hrtimer *timer, clockid_t clockid,
+ enum hrtimer_mode mode)
+{
+ debug_hrtimer_init(timer);
+ trace_hrtimer_init(timer, clockid, mode);
+}
+
+static inline void debug_activate(struct hrtimer *timer)
+{
+ debug_hrtimer_activate(timer);
+ trace_hrtimer_start(timer);
+}
+
+static inline void debug_deactivate(struct hrtimer *timer)
+{
+ debug_hrtimer_deactivate(timer);
+ trace_hrtimer_cancel(timer);
+}
+
/* High resolution timer related functions */
#ifdef CONFIG_HIGH_RES_TIMERS

@@ -797,7 +819,7 @@ static int enqueue_hrtimer(struct hrtimer *timer,
struct hrtimer *entry;
int leftmost = 1;

- debug_hrtimer_activate(timer);
+ debug_activate(timer);

/*
* Find the right place in the rbtree:
@@ -883,7 +905,7 @@ remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base)
* reprogramming happens in the interrupt handler. This is a
* rare case and less expensive than a smp call.
*/
- debug_hrtimer_deactivate(timer);
+ debug_deactivate(timer);
timer_stats_hrtimer_clear_start_info(timer);
reprogram = base->cpu_base == &__get_cpu_var(hrtimer_bases);
__remove_hrtimer(timer, base, HRTIMER_STATE_INACTIVE,
@@ -1116,7 +1138,7 @@ static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
void hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
enum hrtimer_mode mode)
{
- debug_hrtimer_init(timer);
+ debug_init(timer, clock_id, mode);
__hrtimer_init(timer, clock_id, mode);
}
EXPORT_SYMBOL_GPL(hrtimer_init);
@@ -1140,7 +1162,7 @@ int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp)
}
EXPORT_SYMBOL_GPL(hrtimer_get_res);

-static void __run_hrtimer(struct hrtimer *timer)
+static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
{
struct hrtimer_clock_base *base = timer->base;
struct hrtimer_cpu_base *cpu_base = base->cpu_base;
@@ -1149,7 +1171,7 @@ static void __run_hrtimer(struct hrtimer *timer)

WARN_ON(!irqs_disabled());

- debug_hrtimer_deactivate(timer);
+ debug_deactivate(timer);
__remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
timer_stats_account_hrtimer(timer);
fn = timer->function;
@@ -1160,7 +1182,9 @@ static void __run_hrtimer(struct hrtimer *timer)
* the timer base.
*/
spin_unlock(&cpu_base->lock);
+ trace_hrtimer_expire_entry(timer, now);
restart = fn(timer);
+ trace_hrtimer_expire_exit(timer);
spin_lock(&cpu_base->lock);

/*
@@ -1271,7 +1295,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
break;
}

- __run_hrtimer(timer);
+ __run_hrtimer(timer, &basenow);
}
base++;
}
@@ -1393,7 +1417,7 @@ void hrtimer_run_queues(void)
hrtimer_get_expires_tv64(timer))
break;

- __run_hrtimer(timer);
+ __run_hrtimer(timer, &base->softirq_time);
}
spin_unlock(&cpu_base->lock);
}
@@ -1569,7 +1593,7 @@ static void migrate_hrtimer_list(struct hrtimer_clock_base *old_base,
while ((node = rb_first(&old_base->active))) {
timer = rb_entry(node, struct hrtimer, node);
BUG_ON(hrtimer_callback_running(timer));
- debug_hrtimer_deactivate(timer);
+ debug_deactivate(timer);

/*
* Mark it as STATE_MIGRATE not INACTIVE otherwise the


2009-10-13 03:27:11

by Steven Rostedt

[permalink] [raw]
Subject: Re: [tip:timers/tracing] hrtimer: Add tracepoint for hrtimers

On Wed, 2009-09-02 at 12:18 +0000, tip-bot for Xiao Guangrong wrote:

> +/**
> + * hrtimer_start - called when the hrtimer is started
> + * @timer: pointer to struct hrtimer
> + */
> +TRACE_EVENT(hrtimer_start,
> +
> + TP_PROTO(struct hrtimer *timer),
> +
> + TP_ARGS(timer),
> +
> + TP_STRUCT__entry(
> + __field( void *, timer )
> + __field( void *, function )
> + __field( s64, expires )
> + __field( s64, softexpires )
> + ),
> +
> + TP_fast_assign(
> + __entry->timer = timer;
> + __entry->function = timer->function;
> + __entry->expires = hrtimer_get_expires(timer).tv64;
> + __entry->softexpires = hrtimer_get_softexpires(timer).tv64;
> + ),
> +
> + TP_printk("hrtimer %p, func %pf, expires %llu, softexpires %llu",
> + __entry->timer, __entry->function,
> + (unsigned long long)ktime_to_ns((ktime_t) {
> + .tv64 = __entry->expires }),
> + (unsigned long long)ktime_to_ns((ktime_t) {
> + .tv64 = __entry->softexpires }))
> +);
> +

I was just fixing up the trace-cmd parser (which is also used for perf)
and came across this code. That TP_printk shows up in the format file
as:

print fmt: "hrtimer %p, func %pf, expires %llu, softexpires %llu", REC->timer, REC->function, (unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64), (unsigned long long)(((ktime_t) { .tv64 = REC->softexpires }).tv64)

Now I'm trying to parse this with a tool so that we can have something
that can read the binary output, and be able to easily figure it out by
reading the format files. But trying to parse:

(unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64)

Is not easy. It's basically implementing a C interpreter :-(

Is there anyway we can change that TRACE_EVENT to give us a better,
trace printk output?

-- Steve

2009-10-13 07:09:58

by Ingo Molnar

[permalink] [raw]
Subject: Re: [tip:timers/tracing] hrtimer: Add tracepoint for hrtimers


* Steven Rostedt <[email protected]> wrote:

> On Wed, 2009-09-02 at 12:18 +0000, tip-bot for Xiao Guangrong wrote:
>
> > +/**
> > + * hrtimer_start - called when the hrtimer is started
> > + * @timer: pointer to struct hrtimer
> > + */
> > +TRACE_EVENT(hrtimer_start,
> > +
> > + TP_PROTO(struct hrtimer *timer),
> > +
> > + TP_ARGS(timer),
> > +
> > + TP_STRUCT__entry(
> > + __field( void *, timer )
> > + __field( void *, function )
> > + __field( s64, expires )
> > + __field( s64, softexpires )
> > + ),
> > +
> > + TP_fast_assign(
> > + __entry->timer = timer;
> > + __entry->function = timer->function;
> > + __entry->expires = hrtimer_get_expires(timer).tv64;
> > + __entry->softexpires = hrtimer_get_softexpires(timer).tv64;
> > + ),
> > +
> > + TP_printk("hrtimer %p, func %pf, expires %llu, softexpires %llu",
> > + __entry->timer, __entry->function,
> > + (unsigned long long)ktime_to_ns((ktime_t) {
> > + .tv64 = __entry->expires }),
> > + (unsigned long long)ktime_to_ns((ktime_t) {
> > + .tv64 = __entry->softexpires }))
> > +);
> > +
>
> I was just fixing up the trace-cmd parser (which is also used for perf)
> and came across this code. That TP_printk shows up in the format file
> as:
>
> print fmt: "hrtimer %p, func %pf, expires %llu, softexpires %llu", REC->timer, REC->function, (unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64), (unsigned long long)(((ktime_t) { .tv64 = REC->softexpires }).tv64)
>
> Now I'm trying to parse this with a tool so that we can have something
> that can read the binary output, and be able to easily figure it out by
> reading the format files. But trying to parse:
>
> (unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64)
>
> Is not easy. It's basically implementing a C interpreter :-(

Btw., what i suggested quite some time ago was that we should bind
tracepoints by emitting C source code stubs, which tools can then build
and link in, using gcc.

Ingo

2009-10-13 07:49:05

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [tip:timers/tracing] hrtimer: Add tracepoint for hrtimers

On Mon, 2009-10-12 at 23:25 -0400, Steven Rostedt wrote:
> > + TP_printk("hrtimer %p, func %pf, expires %llu, softexpires %llu",
> > + __entry->timer, __entry->function,
> > + (unsigned long long)ktime_to_ns((ktime_t) {
> > + .tv64 = __entry->expires }),
> > + (unsigned long long)ktime_to_ns((ktime_t) {
> > + .tv64 = __entry->softexpires }))
> > +);
> > +
>
> I was just fixing up the trace-cmd parser (which is also used for perf)
> and came across this code. That TP_printk shows up in the format file
> as:
>
> print fmt: "hrtimer %p, func %pf, expires %llu, softexpires %llu", REC->timer, REC->function, (unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64), (unsigned long long)(((ktime_t) { .tv64 = REC->softexpires }).tv64)
>
> Now I'm trying to parse this with a tool so that we can have something
> that can read the binary output, and be able to easily figure it out by
> reading the format files. But trying to parse:
>
> (unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64)
>
> Is not easy. It's basically implementing a C interpreter :-(
>
> Is there anyway we can change that TRACE_EVENT to give us a better,
> trace printk output?

LOL

Either convince tglx that ktime_t really ought to die for good (yeah, I
know arm), or specialize the whole printout for 32 vs 64 ktime_t.

union ktime {
s64 tv64;
#if BITS_PER_LONG != 64 && !defined(CONFIG_KTIME_SCALAR)
struct {
# ifdef __BIG_ENDIAN
s32 sec, nsec;
# else
s32 nsec, sec;
# endif
} tv;
#endif
};

2009-10-13 13:19:04

by Steven Rostedt

[permalink] [raw]
Subject: Re: [tip:timers/tracing] hrtimer: Add tracepoint for hrtimers

On Tue, 2009-10-13 at 09:08 +0200, Ingo Molnar wrote:
> * Steven Rostedt <[email protected]> wrote:
>

> > (unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64)
> >
> > Is not easy. It's basically implementing a C interpreter :-(
>
> Btw., what i suggested quite some time ago was that we should bind
> tracepoints by emitting C source code stubs, which tools can then build
> and link in, using gcc.

Yeah, and I thought about that too. But that kills any chance of running
the trace on one box (non x86) and reading it on another (x86). And that
is one of my goals for this.

-- Steve

2009-10-13 13:20:24

by Steven Rostedt

[permalink] [raw]
Subject: Re: [tip:timers/tracing] hrtimer: Add tracepoint for hrtimers

On Tue, 2009-10-13 at 09:17 -0400, Steven Rostedt wrote:
> On Tue, 2009-10-13 at 09:08 +0200, Ingo Molnar wrote:
> > * Steven Rostedt <[email protected]> wrote:
> >
>
> > > (unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64)
> > >
> > > Is not easy. It's basically implementing a C interpreter :-(
> >
> > Btw., what i suggested quite some time ago was that we should bind
> > tracepoints by emitting C source code stubs, which tools can then build
> > and link in, using gcc.
>
> Yeah, and I thought about that too. But that kills any chance of running
> the trace on one box (non x86) and reading it on another (x86). And that
> is one of my goals for this.

I'd love to see the kernel code for having this also understand ktime_t.

-- Steve

2009-10-13 13:28:03

by Ingo Molnar

[permalink] [raw]
Subject: Re: [tip:timers/tracing] hrtimer: Add tracepoint for hrtimers


* Steven Rostedt <[email protected]> wrote:

> On Tue, 2009-10-13 at 09:08 +0200, Ingo Molnar wrote:
> > * Steven Rostedt <[email protected]> wrote:
> >
>
> > > (unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64)
> > >
> > > Is not easy. It's basically implementing a C interpreter :-(
> >
> > Btw., what i suggested quite some time ago was that we should bind
> > tracepoints by emitting C source code stubs, which tools can then build
> > and link in, using gcc.
>
> Yeah, and I thought about that too. But that kills any chance of
> running the trace on one box (non x86) and reading it on another
> (x86). And that is one of my goals for this.

Why does it kill that chance?

Ingo

2009-10-13 14:04:38

by Steven Rostedt

[permalink] [raw]
Subject: Re: [tip:timers/tracing] hrtimer: Add tracepoint for hrtimers

On Tue, 2009-10-13 at 15:26 +0200, Ingo Molnar wrote:
> * Steven Rostedt <[email protected]> wrote:
>
> > On Tue, 2009-10-13 at 09:08 +0200, Ingo Molnar wrote:
> > > * Steven Rostedt <[email protected]> wrote:
> > >
> >
> > > > (unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64)
> > > >
> > > > Is not easy. It's basically implementing a C interpreter :-(
> > >
> > > Btw., what i suggested quite some time ago was that we should bind
> > > tracepoints by emitting C source code stubs, which tools can then build
> > > and link in, using gcc.
> >
> > Yeah, and I thought about that too. But that kills any chance of
> > running the trace on one box (non x86) and reading it on another
> > (x86). And that is one of my goals for this.
>
> Why does it kill that chance?

Ah, I was thinking binary libraries from the trace points. But I missed
you said C source code.

But still, we need to build a way to convert big endian to little, as
well as perhaps converting 32bit to 64 bit longs. As well as vice versa.

We also need a way to export all structures (here ktime_t) as well as
enums. Of course we need to export enums now anyway, since the tools
don't know how to convert them now.

-- Steve

2009-10-13 15:36:33

by Mathieu Desnoyers

[permalink] [raw]
Subject: [RFC] Trace types registry

* Steven Rostedt ([email protected]) wrote:
> On Tue, 2009-10-13 at 15:26 +0200, Ingo Molnar wrote:
> > * Steven Rostedt <[email protected]> wrote:
> >
> > > On Tue, 2009-10-13 at 09:08 +0200, Ingo Molnar wrote:
> > > > * Steven Rostedt <[email protected]> wrote:
> > > >
> > >
> > > > > (unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64)
> > > > >
> > > > > Is not easy. It's basically implementing a C interpreter :-(
> > > >
> > > > Btw., what i suggested quite some time ago was that we should bind
> > > > tracepoints by emitting C source code stubs, which tools can then build
> > > > and link in, using gcc.
> > >
> > > Yeah, and I thought about that too. But that kills any chance of
> > > running the trace on one box (non x86) and reading it on another
> > > (x86). And that is one of my goals for this.
> >
> > Why does it kill that chance?
>
> Ah, I was thinking binary libraries from the trace points. But I missed
> you said C source code.
>
> But still, we need to build a way to convert big endian to little, as
> well as perhaps converting 32bit to 64 bit longs. As well as vice versa.
>
> We also need a way to export all structures (here ktime_t) as well as
> enums. Of course we need to export enums now anyway, since the tools
> don't know how to convert them now.
>
> -- Steve

I fully agree with Steve here. Embedded system developers, for instance,
will typically export traces taken on small embedded devices to a
completely different architecture for analysis and visualization.

The problem I see with exporting this as C code is that it does not
represent the type size and endianness of the traced kernel.

I think I may have a solution that could deal with these compound types.
Comments are welcome.

The idea is to create a type registry, that could be expressed in either
textual or binary format. (binary format simply requires a header with
the architecture endianness and basic types sizes expressed)

The registry, which would be saved along with a trace, could look like
what follows (written in C here, but would be created dynamically). Note
that this registry is only saved once with a trace, so we do not care
about compactness. What we really do care about is to make it easy to
parse a trace coming from an alien architecture.

enum entry_type { T_SCALAR, T_COMPOUND, T_EVENT };

struct event_field {
u32 type_id; /* maps to a type */
char name[MAX_FIELD_NAME];
};

union compound_field {
u32 type_id; /* Unnamed fields (array, sequence, string) */
struct event_field field; /* Named fields (struct, union) */
}

/* Use "type_id" lookup to find if type is scalar or compound */
union event_type {
struct event_type_scalar scalar;
struct event_type_compound compound;
};

struct event_type_scalar {
struct entry_type etype;
u32 type_id; /* Allocated dynamically */
char name[MAX_TYPE_NAME];
u32 size;
u8 host_byte_order;
u8 signed;
u8 format; /* one of 256 possible print formats */
u8 float; /* useful for user-space tracing */
};

/* array, sequence, struct, union, string */
struct event_type_compound {
struct entry_type etype;
u32 type_id; /* Allocated dynamically */
char name[MAX_TYPE_NAME];
enum { T_ARRAY, T_SEQ, T_STRUCT, T_UNION, T_STRING } type;
u32 nr_fields;
union compound_field[nr_fields] field;
};

struct event {
struct entry_type etype;
/* map to the event */
unsigned int group_id;
unsigned int event_id;
char group_name[MAX_GROUP_NAME];
char event_name[MAX_EVENT_NAME];
u32 type_id;
};

stuct trace_registry {
struct regheader {
u32 magic_num; /* detect endianness */
u8 int_size; /* basic types */
u8 long_size;
u8 size_t_size;
u8 off_t_size;
u8 ptr_diff_t_size;
}

struct event_type_scalar {
.etype = T_SCALAR,
.type_id = lookup_or_create_type_id(char);
.name = "char",
.size = 1,
.host_byte_order = 1,
.signed = 0,
.format = 'c',
.float = 0,
}

struct event_type_compound = {
.etype = T_COMPOUND,
.type_id = lookup_or_create_type_id(char [TASK_COMM_LEN]),
.name = "char [TASK_COMM_LEN]",
.type = T_ARRAY,
.nr_fields = TASK_COMM_LEN,
.field[0].type_id = lookup_type_id("char");
}

struct event_type_scalar {
.etype = T_SCALAR,
.type_id = lookup_or_create_type_id(pid_t),
.name = "pid_t",
.size = sizeof(pid_t),
.host_byte_order = 1,
.signed = 0,
.format = 'u',
.float = 0,
}

struct event_type_scalar {
.etype = T_SCALAR,
.type_id = lookup_or_create_type_id(int),
.name = "int",
.size = sizeof(int),
.host_byte_order = 1,
.signed = 1,
.format = 'd',
.float = 0,
}

struct event_type_compound = {
.etype = T_COMPOUND,
.type_id = lookup_or_create_type_id(
{
char [TASK_COMM_LEN];
pid_t;
int;
char;
pid_t;
int;
});

.name = "{
char [TASK_COMM_LEN];
pid_t;
int;
char;
pid_t;
int;
}",
.type = T_STRUCT,
.nr_fields = 6,
.field[0].field.type_id = lookup_type_id(char [TASK_COMM_LEN]),
.field[0].field.name = "prev_comm",
.field[1].field.type_id = lookup_type_id(pid_t),
.field[1].field.name = "prev_pid",
.field[2].field.type_id = lookup_type_id(int),
.field[2].field.name = "prev_prio",
.field[2].field.type_id = lookup_type_id(char [TASK_COMM_LEN]),
.field[2].field.name = "next_comm",
.field[2].field.type_id = lookup_type_id(pid_t),
.field[2].field.name = "next_pid",
.field[2].field.type_id = lookup_type_id(int),
.field[2].field.name = "next_prio",
}

struct event sched_switch_event {
.etype = T_EVENT,
.group_id = lookup_event_group("groupname"),
.event_id = lookup_event_id("evname"),
.groupname = "groupname",
.eventname = "evname",
.type_id = lookup_type_id(
{
char [TASK_COMM_lEN];
pid_t;
int;
char;
pid_t;
int;
});

.......

}
}

Note: lookup_type_id() and lookup_or_create_type_id() could be a macro
that stringify their parameters.

Steve, do you think this could be created by TRACE_EVENT() automatically ?
Basically, each field and event would try to register its types at
module load time. Registration would allocate a unique ID to each type
and a unique ID to each group and event.

When "unrolling" the registry at trace analysis time, these IDs would be
used to populate a hash table, so the event, group and type lookups can
be performed.

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-10-13 18:42:55

by Ingo Molnar

[permalink] [raw]
Subject: Re: [tip:timers/tracing] hrtimer: Add tracepoint for hrtimers


* Steven Rostedt <[email protected]> wrote:

> On Tue, 2009-10-13 at 15:26 +0200, Ingo Molnar wrote:
> > * Steven Rostedt <[email protected]> wrote:
> >
> > > On Tue, 2009-10-13 at 09:08 +0200, Ingo Molnar wrote:
> > > > * Steven Rostedt <[email protected]> wrote:
> > > >
> > >
> > > > > (unsigned long long)(((ktime_t) { .tv64 = REC->expires }).tv64)
> > > > >
> > > > > Is not easy. It's basically implementing a C interpreter :-(
> > > >
> > > > Btw., what i suggested quite some time ago was that we should bind
> > > > tracepoints by emitting C source code stubs, which tools can then build
> > > > and link in, using gcc.
> > >
> > > Yeah, and I thought about that too. But that kills any chance of
> > > running the trace on one box (non x86) and reading it on another
> > > (x86). And that is one of my goals for this.
> >
> > Why does it kill that chance?
>
> Ah, I was thinking binary libraries from the trace points. But I
> missed you said C source code.
>
> But still, we need to build a way to convert big endian to little, as
> well as perhaps converting 32bit to 64 bit longs. As well as vice
> versa.

Exporting the print line as C in a convenient way does not in any way
make it more difficult to define endianness and bit size - we have to do
that anyway. It's largely orthogonal.

Ingo

2009-10-13 18:58:15

by Steven Rostedt

[permalink] [raw]
Subject: Re: [tip:timers/tracing] hrtimer: Add tracepoint for hrtimers

On Tue, 2009-10-13 at 20:41 +0200, Ingo Molnar wrote:

> > But still, we need to build a way to convert big endian to little, as
> > well as perhaps converting 32bit to 64 bit longs. As well as vice
> > versa.
>
> Exporting the print line as C in a convenient way does not in any way
> make it more difficult to define endianness and bit size - we have to do
> that anyway. It's largely orthogonal.

The way I do it now, is basically making a C interpreter. This can
easily handle word size and endianess (I have those exported too, and
can handle differences between host and file).

But to export the "printf" as a C function, that C function will need to
know a way what size a long is, as well as how to convert endianess.

I'm not really sure how to go about doing something like this
automatically. Every event will need its own C function to convert.
Sounds more like what Mathieu was describing.

We still don't have a way to handle enums and structures (like ktime).

-- Steve