2009-06-09 01:49:07

by Steven Rostedt

[permalink] [raw]
Subject: [RFC PATCH 2/5] tracing/events: nicer print format for parsing

From: Steven Rostedt <[email protected]>

The current print format that is exported to userspace is simply a
copy of the printk format used to output the data. It would take a
full C parser to parse it. But as more tools are made to read the
binary data from ftrace, the larger the need for a nice parsing
format to facilitate tools in reading the binary buffer.

For example we currently have:

irq_handler_entry:
print fmt: "irq=%d handler=%s", REC->irq, (char *)((void *)REC + REC->__data_loc
_name)

softirq_entry:
print fmt: "softirq=%d action=%s", REC->vec, ({ static const struct trace_print_
flags symbols[] = { { HI_SOFTIRQ, "HI" }, { TIMER_SOFTIRQ, "TIMER" }, { NET_TX_S
OFTIRQ, "NET_TX" }, { NET_RX_SOFTIRQ, "NET_RX" }, { BLOCK_SOFTIRQ, "BLOCK" }, {
TASKLET_SOFTIRQ, "TASKLET" }, { SCHED_SOFTIRQ, "SCHED" }, { HRTIMER_SOFTIRQ, "HR
TIMER" }, { RCU_SOFTIRQ, "RCU" }, { -1, ((void *)0) }}; ftrace_print_symbols_seq
(p, REC->vec, symbols); })

kmalloc:
print fmt: "call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s", RE
C->call_site, REC->ptr, REC->bytes_req, REC->bytes_alloc, (REC->gfp_flags) ? ({
static const struct trace_print_flags flags[] = { {(unsigned long)(((gfp_t)0x10u
) | ((gfp_t)0x40u) | ((gfp_t)0x80u) | ((gfp_t)0x20000u) | ((gfp_t)0x02u) | ((gfp
_t)0x100000u)), "GFP_HIGHUSER_MOVABLE"}, {(unsigned long)(((gfp_t)0x10u) | ((gfp
_t)0x40u) | ((gfp_t)0x80u) | ((gfp_t)0x20000u) | ((gfp_t)0x02u)), "GFP_HIGHUSER"
}, {(unsigned long)(((gfp_t)0x10u) | ((gfp_t)0x40u) | ((gfp_t)0x80u) | ((gfp_t)0
x20000u)), "GFP_USER"}, {(unsigned long)(((gfp_t)0x10u) | ((gfp_t)0x40u) | ((gfp
_t)0x80u) | ((gfp_t)0x80000u)), "GFP_TEMPORARY"}, {(unsigned long)(((gfp_t)0x10u
) | ((gfp_t)0x40u) | ((gfp_t)0x80u)), "GFP_KERNEL"}, {(unsigned long)(((gfp_t)0x
10u) | ((gfp_t)0x40u)), "GFP_NOFS"}, {(unsigned long)(((gfp_t)0x20u)), "GFP_ATOM
IC"}, {(unsigned long)(((gfp_t)0x10u)), "GFP_NOIO"}, {(unsigned long)((gfp_t)0x2
0u), "GFP_HIGH"}, {(unsigned long)((gfp_t)0x10u), "GFP_WAIT"}, {(unsigned long)(
(gfp_t)0x40u), "GFP_IO"}, {(unsigned long)((gfp_t)0x100u), "GFP_COLD"}, {(unsign
ed long)((gfp_t)0x200u), "GFP_NOWARN"}, {(unsigned long)((gfp_t)0x400u), "GFP_RE
PEAT"}, {(unsigned long)((gfp_t)0x800u), "GFP_NOFAIL"}, {(unsigned long)((gfp_t)
0x1000u), "GFP_NORETRY"}, {(unsigned long)((gfp_t)0x4000u), "GFP_COMP"}, {(unsig
ned long)((gfp_t)0x8000u), "GFP_ZERO"}, {(unsigned long)((gfp_t)0x10000u), "GFP_
NOMEMALLOC"}, {(unsigned long)((gfp_t)0x20000u), "GFP_HARDWALL"}, {(unsigned lon
g)((gfp_t)0x40000u), "GFP_THISNODE"}, {(unsigned long)((gfp_t)0x80000u), "GFP_RE
CLAIMABLE"}, {(unsigned long)((gfp_t)0x100000u), "GFP_MOVABLE"}, { -1, ((void *)
0) }}; ftrace_print_flags_seq(p, "|", REC->gfp_flags, flags); }) : "GFP_NOWAIT"

The language that is added by this patch is of the following:

* FMT := constant string FMT | COMMAND FMT | empty
* COMMAND := <TYPE:FIELD> | <mask:FIELD:DELIM:MASKS> | <sym:FIELD:SYMBOLS> |
* <if:FIELD:TRUE:FALSE>
* TYPE := int | hex | ptr | string | strarray
* FIELD := defined by the event structure
* MASKS := MASK=NAME,MASKS | MASK=NAME
* MASK := the bit mask to match
* DELIM := delimiter to separate the fields. None and ':' are both allowed
* SYMBOLS := SYM=NAME,SYMBOLS | SYM=NAME
* SYM := the symbol value to test against
* TRUE := print when field is non zero
* FALSE := print when field is zero or NULL
* NAME := the name to write when a match is found
*
* A '\<' would print '<'

The above examples would then look like:

irq_handler_entry:
format: irq=<int:irq> handler=<string:name>

softirq_entry:
format: softirq=<int:vec> action=<sym:vec:0=HI,1=TIMER,2=NET_TX,3=NET_RX,4=BLOCK,5=TASKLET,6=SCHED,7=HRTIMER,8=RCU

kmalloc:
format: call_site=<hex:call_site> ptr=<ptr:ptr> bytes_req=<int:bytes_req> bytes_alloc=<int:bytes_alloc> gfp_flags=<mask:gfp_flags:|:0=GFP_NOWAIT,0x1200d2=GFP_HIGHUSER_MOVABLE,0x200d2=GFP_HIGHUSER,0x200d0=GFP_USER,0x800d0=GFP_TEMPORARY,0xd0=GFP_KERNEL,0x50=GFP_NOFS,0x20=GFP_ATOMIC,0x10=GFP_NOIO,0x20=GFP_HIGH,0x10=GFP_WAIT,0x40=GFP_IO,0x100=GFP_COLD,0x200=GFP_NOWARN,0x400=GFP_REPEAT,0x800=GFP_NOFAIL,0x1000=GFP_NORETRY,0x4000=GFP_COMP,0x8000=GFP_ZERO,0x10000=GFP_NOMEMALLOC,0x20000=GFP_HARDWALL,0x40000=GFP_THISNODE,0x80000=GFP_RECLAIMABLE,0x100000=GFP_MOVABLE

The above "mask" command takes '0' as a special mask that should be done only in the beginning. It will write that symbol out when the mask is zero.

Another nice thing about this change set is that it can live together with
the current print format. The old version will show "print fmt:" in
the output file, where as the new version will use "format:" as shown
in the above examples.

Both may be used, but it would be preferable to use the new language.
If the new language is not adequate for a new trace point we can always
add new types. Userspace tools should just ignore types it does not
understand.

Signed-off-by: Steven Rostedt <[email protected]>
---
include/linux/ftrace_event.h | 10 +
include/trace/ftrace.h | 22 ++-
kernel/trace/Makefile | 1 +
kernel/trace/trace_read_binary.c | 674 ++++++++++++++++++++++++++++++++++++++
4 files changed, 705 insertions(+), 2 deletions(-)
create mode 100644 kernel/trace/trace_read_binary.c

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 5c093ff..f1b59d3 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -119,6 +119,9 @@ struct ftrace_event_call {
void *filter;
void *mod;

+ struct list_head print_info;
+ const char *print_text;
+
#ifdef CONFIG_EVENT_PROFILE
atomic_t profile_count;
int (*profile_enable)(struct ftrace_event_call *);
@@ -136,6 +139,13 @@ extern int filter_current_check_discard(struct ftrace_event_call *call,
void *rec,
struct ring_buffer_event *event);

+extern char *ftrace_read_binary(struct trace_seq *p,
+ struct ftrace_event_call *event,
+ struct trace_entry *entry);
+extern int ftrace_initialize_print(struct ftrace_event_call *event,
+ const char *fmt, ...)
+ __attribute__ ((format (printf, 2, 3)));
+
extern int trace_define_field(struct ftrace_event_call *call, char *type,
char *name, int offset, int size, int is_signed);

diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 40ede4d..e3370c5 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -124,6 +124,10 @@
#undef TP_printk
#define TP_printk(fmt, args...) fmt "\n", args

+#undef TP_FORMAT
+#define TP_FORMAT(fmt, args...) \
+ "%s\n", ftrace_read_binary(p, event_call, entry)
+
#undef __get_dynamic_array
#define __get_dynamic_array(field) \
((void *)__entry + __entry->__data_loc_##field)
@@ -152,6 +156,7 @@
enum print_line_t \
ftrace_raw_output_##call(struct trace_iterator *iter, int flags) \
{ \
+ struct ftrace_event_call *event_call __maybe_unused = &event_##call; \
struct trace_seq *s = &iter->seq; \
struct ftrace_raw_##call *field; \
struct trace_entry *entry; \
@@ -234,7 +239,10 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags) \
#define __entry REC

#undef TP_printk
-#define TP_printk(fmt, args...) "%s, %s\n", #fmt, __stringify(args)
+#define TP_printk(fmt, args...) "print fmt: %s, %s\n", #fmt, __stringify(args)
+
+#undef TP_FORMAT
+#define TP_FORMAT(fmt, args...) "format: " fmt "\n", ##args

#undef TP_fast_assign
#define TP_fast_assign(args...) args
@@ -249,7 +257,7 @@ ftrace_format_##call(struct trace_seq *s) \
\
tstruct; \
\
- trace_seq_printf(s, "\nprint fmt: " print); \
+ trace_seq_printf(s, "\n" print); \
\
return ret; \
}
@@ -279,6 +287,13 @@ ftrace_format_##call(struct trace_seq *s) \
offsetof(typeof(field), __data_loc_##item), \
sizeof(field.__data_loc_##item), 0);

+#undef TP_printk
+#define TP_printk(fmt, args...)
+
+#undef TP_FORMAT
+#define TP_FORMAT(fmt, args...) \
+ ftrace_initialize_print(event_call, fmt, ##args)
+
#undef __string
#define __string(item, src) __dynamic_array(char, item, -1)

@@ -299,6 +314,8 @@ ftrace_define_fields_##call(void) \
\
tstruct; \
\
+ print; \
+ \
return ret; \
}

@@ -563,6 +580,7 @@ static int ftrace_raw_init_event_##call(void) \
event_##call.id = id; \
INIT_LIST_HEAD(&event_##call.fields); \
init_preds(&event_##call); \
+ INIT_LIST_HEAD(&event_##call.print_info); \
return 0; \
} \
\
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 06b8585..7c2ff68 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -51,5 +51,6 @@ obj-$(CONFIG_EVENT_TRACING) += trace_export.o
obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o
obj-$(CONFIG_EVENT_PROFILE) += trace_event_profile.o
obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
+obj-$(CONFIG_EVENT_TRACING) += trace_read_binary.o

libftrace-y := ftrace.o
diff --git a/kernel/trace/trace_read_binary.c b/kernel/trace/trace_read_binary.c
new file mode 100644
index 0000000..100d5c0
--- /dev/null
+++ b/kernel/trace/trace_read_binary.c
@@ -0,0 +1,674 @@
+/*
+ * trace_read_binary.c
+ *
+ * Copyright (C) 2009 Red Hat Inc, Steven Rostedt <[email protected]>
+ *
+ */
+
+#include <linux/ftrace_event.h>
+#include <linux/hardirq.h>
+#include <linux/module.h>
+#include <linux/ctype.h>
+
+#include "trace.h"
+
+static DEFINE_MUTEX(buffer_lock);
+static struct trace_seq buffer;
+
+/*
+ * Binary string parser. The print format uses a special language to explain
+ * the format to print the entry out. The language is as follows:
+ *
+ * FMT := constant string FMT | COMMAND FMT | empty
+ * COMMAND := <TYPE:FIELD> | <mask:FIELD:DELIM:MASKS> | <sym:FIELD:SYMBOLS> |
+ * <if:FIELD:TRUE:FALSE>
+ * TYPE := int | hex | ptr | string | strarray
+ * FIELD := defined by the event structure
+ * MASKS := MASK=NAME,MASKS | MASK=NAME
+ * MASK := the bit mask to match
+ * DELIM := delimiter to separate the fields. None and ':' are both allowed
+ * SYMBOLS := SYM=NAME,SYMBOLS | SYM=NAME
+ * SYM := the symbol value to test against
+ * TRUE := print when field is non zero
+ * FALSE := print when field is zero or NULL
+ * NAME := the name to write when a match is found
+ *
+ * A '\<' would print '<'
+ */
+
+#define TOK_SIZE 32
+
+enum field_types {
+ FIELD_IS_TEXT,
+ FIELD_IS_INT,
+ FIELD_IS_PTR,
+ FIELD_IS_LT,
+ FIELD_IS_IF,
+ FIELD_IS_STRING,
+ FIELD_IS_STRARRAY,
+ FIELD_IS_HEX,
+ FIELD_IS_MASK,
+ FIELD_IS_SYMBOL,
+};
+
+struct sym_mask {
+ struct list_head list;
+ unsigned long long val;
+ unsigned short start;
+ unsigned short len;
+};
+
+struct print_info {
+ struct list_head list;
+ enum field_types type;
+ union {
+ struct {
+ unsigned short start;
+ unsigned short len;
+ } text;
+ struct {
+ struct ftrace_event_field *field;
+ } data;
+ struct {
+ struct ftrace_event_field *field;
+ unsigned short true_text;
+ unsigned short true_len;
+ unsigned short false_text;
+ unsigned short false_len;
+ } cond;
+ struct {
+ struct ftrace_event_field *field;
+ struct list_head masks;
+ unsigned short delim;
+ unsigned short len;
+ } sym_mask;
+ };
+};
+
+static struct print_info *
+alloc_print_info(struct ftrace_event_call *call, enum field_types type)
+{
+ struct print_info *info;
+
+ info = kmalloc(sizeof(*info), GFP_KERNEL);
+ if (!info)
+ return NULL;
+
+ info->type = type;
+
+ list_add_tail(&info->list, &call->print_info);
+
+ return info;
+}
+
+static int
+add_text(struct ftrace_event_call *call, const char *start, const char *end)
+{
+ struct print_info *info;
+
+ info = alloc_print_info(call, FIELD_IS_TEXT);
+ if (!info)
+ return -ENOMEM;
+
+ info->text.start = start - call->print_text;
+ if (!end)
+ end = call->print_text + strlen(call->print_text);
+ info->text.len = end - start;
+
+ return 0;
+}
+
+static int
+add_less_than(struct ftrace_event_call *call, const char *start, const char *end)
+{
+ struct print_info *info;
+
+ info = alloc_print_info(call, FIELD_IS_LT);
+ if (!info)
+ return -ENOMEM;
+
+ info->text.start = start - call->print_text;
+ info->text.len = end - start;
+
+ return 0;
+}
+
+static int
+add_data(struct ftrace_event_call *call, enum field_types type,
+ struct ftrace_event_field *field)
+{
+ struct print_info *info;
+
+ info = alloc_print_info(call, type);
+ if (!info)
+ return -ENOMEM;
+
+ info->data.field = field;
+
+ return 0;
+}
+
+static int
+add_if(struct ftrace_event_call *call, struct ftrace_event_field *field,
+ const char *fmt, const char *end)
+{
+ struct print_info *info;
+ const char *tok;
+
+ info = alloc_print_info(call, FIELD_IS_IF);
+ if (!info)
+ return -ENOMEM;
+
+ info->cond.field = field;
+
+ tok = strchr(fmt, ':');
+ if (!tok || tok > end)
+ return -1;
+
+ info->cond.true_text = fmt - call->print_text;
+ info->cond.true_len = tok - fmt;
+
+ fmt = tok + 1;
+
+ info->cond.false_text = fmt - call->print_text;
+ info->cond.false_len = end - fmt;
+
+ return 0;
+}
+
+static int add_sym_mask(struct ftrace_event_call *call, struct list_head *list,
+ unsigned long long val,
+ const char *start, const char *end)
+{
+ struct sym_mask *sm;
+
+ sm = kmalloc(sizeof(*sm), GFP_KERNEL);
+ if (!sm)
+ return -ENOMEM;
+
+ list_add_tail(&sm->list, list);
+ sm->val = val;
+ sm->start = start - call->print_text;
+ sm->len = end - start;
+
+ return 0;
+}
+
+static const char *
+add_mask(struct ftrace_event_call *call, struct ftrace_event_field *field,
+ const char *delim, unsigned int delim_len,
+ const char *fmt, const char *end)
+{
+ struct print_info *info;
+ unsigned long long mask;
+ const char *tok;
+
+ info = alloc_print_info(call, FIELD_IS_MASK);
+ if (!info)
+ return end;
+
+ info->sym_mask.field = field;
+
+ INIT_LIST_HEAD(&info->sym_mask.masks);
+ info->sym_mask.len = delim_len;
+ if (delim_len)
+ info->sym_mask.delim = delim - call->print_text;
+
+ do {
+ while (isspace(*fmt))
+ fmt++;
+
+ tok = strchr(fmt, '=');
+ if (!tok || tok > end)
+ goto out_err;
+
+ mask = simple_strtoull(fmt, NULL, 0);
+ fmt = tok + 1;
+
+ tok = strchr(fmt, ',');
+ if (!tok || tok > end)
+ tok = end;
+
+ add_sym_mask(call, &info->sym_mask.masks, mask, fmt, tok);
+
+ fmt = tok + 1;
+ } while (fmt < end);
+
+ return end;
+ out_err:
+ WARN_ON(1);
+ printk("error in format '%s'\n", fmt);
+ return end;
+}
+
+static const char *
+add_symbol(struct ftrace_event_call *call, struct ftrace_event_field *field,
+ const char *fmt, const char *end)
+{
+ struct print_info *info;
+ unsigned long long sym;
+ const char *tok;
+
+ info = alloc_print_info(call, FIELD_IS_SYMBOL);
+ if (!info)
+ return end;
+
+ info->sym_mask.field = field;
+
+ INIT_LIST_HEAD(&info->sym_mask.masks);
+
+ do {
+ while (isspace(*fmt))
+ fmt++;
+
+ tok = strchr(fmt, '=');
+ if (!tok || tok > end)
+ goto out_err;
+
+ sym = simple_strtoull(fmt, NULL, 0);
+ fmt = tok + 1;
+
+ tok = strchr(fmt, ',');
+ if (!tok || tok > end)
+ tok = end;
+
+ add_sym_mask(call, &info->sym_mask.masks, sym, fmt, tok);
+
+ fmt = tok + 1;
+ } while (fmt < end);
+
+ return end;
+ out_err:
+ WARN_ON(1);
+ printk("error in format '%s'\n", fmt);
+ return end;
+}
+
+static struct ftrace_event_field *
+find_field(struct ftrace_event_call *call, const char *name, int len)
+{
+ struct ftrace_event_field *field;
+
+ list_for_each_entry(field, &call->fields, link) {
+ if (!strncmp(field->name, name, len))
+ return field;
+ }
+
+ return NULL;
+}
+
+const char *handle_field(struct ftrace_event_call *event,
+ const char *fmt, enum field_types field_type)
+{
+ struct ftrace_event_field *field;
+ const char *end, *tok, *delim;
+ unsigned int delim_len;
+
+ end = strchr(fmt, '>');
+ if (!end)
+ goto out_err;
+
+ switch (field_type) {
+ case FIELD_IS_INT:
+ case FIELD_IS_PTR:
+ case FIELD_IS_HEX:
+ case FIELD_IS_STRING:
+ case FIELD_IS_STRARRAY:
+ field = find_field(event, fmt, end - fmt);
+ if (!field)
+ goto out_err;
+
+ add_data(event, field_type, field);
+ break;
+
+ case FIELD_IS_IF:
+ tok = strchr(fmt, ':');
+ if (!tok || tok > end)
+ goto out_err;
+
+ field = find_field(event, fmt, tok - fmt);
+ if (!field)
+ goto out_err;
+
+ fmt = tok + 1;
+
+ add_if(event, field, fmt, end);
+ break;
+
+ case FIELD_IS_MASK:
+ case FIELD_IS_SYMBOL:
+ tok = strchr(fmt, ':');
+ if (!tok || tok > end)
+ goto out_err;
+
+ field = find_field(event, fmt, tok - fmt);
+ if (!field)
+ goto out_err;
+
+ fmt = tok + 1;
+
+ if (field_type == FIELD_IS_MASK) {
+ tok = strchr(fmt, ':');
+ if (!tok || tok > end)
+ goto out_err;
+
+ delim = fmt;
+ delim_len = tok - fmt;
+
+ /* we allow ':' as a delimiter */
+ if (!delim_len && tok[1] == ':') {
+ tok++;
+ delim_len++;
+ }
+
+ fmt = tok+1;
+
+ end = add_mask(event, field, delim, delim_len, fmt, end);
+ } else
+ end = add_symbol(event, field, fmt, end);
+
+ break;
+ default:
+ WARN_ON(1);
+ printk("unknown field\n");
+ }
+
+ end++;
+ return end;
+
+ out_err:
+ WARN_ON(1);
+ printk("error in format field: '%s'\n", fmt);
+ return NULL;
+}
+
+int
+ftrace_initialize_print(struct ftrace_event_call *event, const char *fmt, ...)
+{
+ const char *tok;
+ va_list ap;
+ int ret;
+
+ mutex_lock(&buffer_lock);
+ trace_seq_init(&buffer);
+
+ va_start(ap, fmt);
+ ret = trace_seq_vprintf(&buffer, fmt, ap);
+ va_end(ap);
+ if (!ret)
+ goto err_unlock;
+
+ ret = trace_seq_putc(&buffer, 0);
+ if (!ret)
+ goto err_unlock;
+
+ event->print_text = kstrdup(buffer.buffer, GFP_KERNEL);
+ if (!event->print_text)
+ goto err_unlock;
+
+ mutex_unlock(&buffer_lock);
+
+ fmt = event->print_text;
+
+ do {
+ enum field_types field_type;
+
+ tok = strchr(fmt, '<');
+ if (!tok) {
+ add_text(event, fmt, tok);
+ break;
+ }
+ if (*(tok - 1) == '\\') {
+ add_less_than(event, fmt, tok);
+ fmt = tok + 1;
+ continue;
+ }
+
+ add_text(event, fmt, tok);
+
+ fmt = tok + 1;
+
+ tok = strchr(fmt, ':');
+ if (!tok)
+ goto err_format;
+
+ if (strncmp(fmt, "int:", 4) == 0)
+ field_type = FIELD_IS_INT;
+
+ else if (strncmp(fmt, "ptr:", 4) == 0)
+ field_type = FIELD_IS_PTR;
+
+ else if (strncmp(fmt, "string:", 7) ==0)
+ field_type = FIELD_IS_STRING;
+
+ else if (strncmp(fmt, "hex:", 4) == 0)
+ field_type = FIELD_IS_HEX;
+
+ else if (strncmp(fmt, "if:", 3) == 0)
+ field_type = FIELD_IS_IF;
+
+ else if (strncmp(fmt, "mask:", 5) == 0)
+ field_type = FIELD_IS_MASK;
+
+ else if (strncmp(fmt, "sym:", 4) == 0)
+ field_type = FIELD_IS_SYMBOL;
+
+ else if (strncmp(fmt, "strarray:", 9) == 0)
+ field_type = FIELD_IS_STRARRAY;
+
+ else
+ goto err_format;
+
+ tok++;
+ fmt = handle_field(event, tok, field_type);
+
+ } while (fmt);
+
+ return 0;
+
+ err_unlock:
+ WARN_ON(1);
+ printk("Can not allocate event print format data\n");
+ mutex_unlock(&buffer_lock);
+ return -1;
+
+ err_format:
+ WARN_ON(1);
+ printk("error in format type: '%s'\n", fmt);
+ return -1;
+}
+EXPORT_SYMBOL_GPL(ftrace_initialize_print);
+
+
+static void
+trace_read_mask(struct trace_seq *s, unsigned long long val,
+ struct print_info *info, struct ftrace_event_call *event)
+{
+ unsigned long long mask;
+ struct sym_mask *sm;
+ int first = 1;
+
+ list_for_each_entry(sm, &info->sym_mask.masks, list) {
+ mask = sm->val;
+
+ if (first && !mask && !val) {
+ trace_seq_putmem(s, event->print_text + sm->start,
+ sm->len);
+ return;
+ }
+
+ if (mask && (mask & val) == mask) {
+ if (first)
+ first = 0;
+ else if (info->sym_mask.len)
+ trace_seq_putmem(s, event->print_text +
+ info->sym_mask.delim,
+ info->sym_mask.len);
+ val &= ~mask;
+
+ trace_seq_putmem(s, event->print_text + sm->start,
+ sm->len);
+ }
+ }
+
+ if (val)
+ trace_seq_printf(s, "(%llx)", val);
+
+ return;
+}
+
+static void
+trace_read_symbol(struct trace_seq *s, unsigned long long val,
+ struct print_info *info, struct ftrace_event_call *event)
+{
+ unsigned long long sym;
+ struct sym_mask *sm;
+ int found = 0;
+
+ list_for_each_entry(sm, &info->sym_mask.masks, list) {
+ sym = sm->val;
+
+ if (sym == val) {
+ found = 1;
+ trace_seq_putmem(s, event->print_text + sm->start,
+ sm->len);
+ break;
+ }
+ }
+
+ if (!found)
+ trace_seq_printf(s, "(%llx)", val);
+
+}
+
+char *
+ftrace_read_binary(struct trace_seq *s, struct ftrace_event_call *event,
+ struct trace_entry *entry)
+{
+ unsigned long long val, mask;
+ struct print_info *info;
+ char *start = s->buffer + s->len;
+ struct ftrace_event_field *field;
+ void *p;
+
+ list_for_each_entry(info, &event->print_info, list) {
+
+ p = entry;
+
+ switch (info->type) {
+ case FIELD_IS_LT:
+ case FIELD_IS_TEXT:
+ trace_seq_putmem(s, event->print_text + info->text.start,
+ info->text.len);
+ if (info->type == FIELD_IS_LT)
+ trace_seq_putc(s, '<');
+ break;
+ case FIELD_IS_INT:
+ case FIELD_IS_HEX:
+ case FIELD_IS_PTR:
+ field = info->data.field;
+ goto skip_if;
+
+ case FIELD_IS_IF:
+ field = info->cond.field;
+ skip_if:
+ p += field->offset;
+
+ switch (field->size) {
+ case 1:
+ val = *(char *)p;
+ mask = 0xffULL;
+ break;
+ case 2:
+ val = *(short *)p;
+ mask = 0xffffULL;
+ break;
+ case 4:
+ val = *(int *)p;
+ mask = 0xffffffffULL;
+ break;
+ case 8:
+ val = *(long long*)p;
+ mask = 0;
+ break;
+
+ default:
+ trace_seq_printf(s, "<error: bad field size %d?>\n",
+ field->size);
+ return start;
+ }
+
+ if (info->type == FIELD_IS_IF) {
+ if (val)
+ trace_seq_putmem(s, event->print_text +
+ info->cond.true_text,
+ info->cond.true_len);
+ else
+ trace_seq_putmem(s, event->print_text +
+ info->cond.false_text,
+ info->cond.false_len);
+ } else if (info->type == FIELD_IS_INT)
+ trace_seq_printf(s, "%lld", val);
+ else {
+ /* hex should only print the size specified */
+ if (mask)
+ val &= mask;
+
+ trace_seq_printf(s, "%llx", val);
+ }
+
+ break;
+
+ case FIELD_IS_STRING:
+ p += info->data.field->offset;
+ /* indexes are expected to be unsigned short */
+ WARN_ON(info->data.field->size != 2);
+ p = (void *)entry + *(unsigned short *)p;
+ trace_seq_puts(s, p);
+ break;
+
+ case FIELD_IS_STRARRAY:
+ p += info->data.field->offset;
+ trace_seq_puts(s, p);
+ break;
+
+ case FIELD_IS_MASK:
+ case FIELD_IS_SYMBOL:
+
+ p += info->sym_mask.field->offset;
+
+ switch (info->sym_mask.field->size) {
+ case 1:
+ val = *(unsigned char *)p;
+ break;
+ case 2:
+ val = *(unsigned short *)p;
+ break;
+ case 4:
+ val = *(unsigned int *)p;
+ break;
+ case 8:
+ val = *(unsigned long long*)p;
+ break;
+
+ default:
+ trace_seq_printf(s, "<error: bad field size %d?>\n",
+ info->sym_mask.field->size);
+ return start;
+ }
+
+ if (info->type == FIELD_IS_MASK)
+ trace_read_mask(s, val, info, event);
+ else
+ trace_read_symbol(s, val, info, event);
+ break;
+ default:
+ trace_seq_printf(s, "UNKNOWN TYPE %d\n", info->type);
+ }
+ }
+
+ trace_seq_putc(s, 0);
+
+ return start;
+}
+EXPORT_SYMBOL_GPL(ftrace_read_binary);
--
1.6.3.1

--


2009-06-09 19:22:20

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing

On Mon, Jun 08, 2009 at 09:45:36PM -0400, Steven Rostedt wrote:
> From: Steven Rostedt <[email protected]>
>
> The current print format that is exported to userspace is simply a
> copy of the printk format used to output the data. It would take a
> full C parser to parse it. But as more tools are made to read the
> binary data from ftrace, the larger the need for a nice parsing
> format to facilitate tools in reading the binary buffer.
>
> For example we currently have:
>
> irq_handler_entry:
> print fmt: "irq=%d handler=%s", REC->irq, (char *)((void *)REC + REC->__data_loc
> _name)
>
> softirq_entry:
> print fmt: "softirq=%d action=%s", REC->vec, ({ static const struct trace_print_
> flags symbols[] = { { HI_SOFTIRQ, "HI" }, { TIMER_SOFTIRQ, "TIMER" }, { NET_TX_S
> OFTIRQ, "NET_TX" }, { NET_RX_SOFTIRQ, "NET_RX" }, { BLOCK_SOFTIRQ, "BLOCK" }, {
> TASKLET_SOFTIRQ, "TASKLET" }, { SCHED_SOFTIRQ, "SCHED" }, { HRTIMER_SOFTIRQ, "HR
> TIMER" }, { RCU_SOFTIRQ, "RCU" }, { -1, ((void *)0) }}; ftrace_print_symbols_seq
> (p, REC->vec, symbols); })
>
> kmalloc:
> print fmt: "call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s", RE
> C->call_site, REC->ptr, REC->bytes_req, REC->bytes_alloc, (REC->gfp_flags) ? ({
> static const struct trace_print_flags flags[] = { {(unsigned long)(((gfp_t)0x10u
> ) | ((gfp_t)0x40u) | ((gfp_t)0x80u) | ((gfp_t)0x20000u) | ((gfp_t)0x02u) | ((gfp
> _t)0x100000u)), "GFP_HIGHUSER_MOVABLE"}, {(unsigned long)(((gfp_t)0x10u) | ((gfp
> _t)0x40u) | ((gfp_t)0x80u) | ((gfp_t)0x20000u) | ((gfp_t)0x02u)), "GFP_HIGHUSER"
> }, {(unsigned long)(((gfp_t)0x10u) | ((gfp_t)0x40u) | ((gfp_t)0x80u) | ((gfp_t)0
> x20000u)), "GFP_USER"}, {(unsigned long)(((gfp_t)0x10u) | ((gfp_t)0x40u) | ((gfp
> _t)0x80u) | ((gfp_t)0x80000u)), "GFP_TEMPORARY"}, {(unsigned long)(((gfp_t)0x10u
> ) | ((gfp_t)0x40u) | ((gfp_t)0x80u)), "GFP_KERNEL"}, {(unsigned long)(((gfp_t)0x
> 10u) | ((gfp_t)0x40u)), "GFP_NOFS"}, {(unsigned long)(((gfp_t)0x20u)), "GFP_ATOM
> IC"}, {(unsigned long)(((gfp_t)0x10u)), "GFP_NOIO"}, {(unsigned long)((gfp_t)0x2
> 0u), "GFP_HIGH"}, {(unsigned long)((gfp_t)0x10u), "GFP_WAIT"}, {(unsigned long)(
> (gfp_t)0x40u), "GFP_IO"}, {(unsigned long)((gfp_t)0x100u), "GFP_COLD"}, {(unsign
> ed long)((gfp_t)0x200u), "GFP_NOWARN"}, {(unsigned long)((gfp_t)0x400u), "GFP_RE
> PEAT"}, {(unsigned long)((gfp_t)0x800u), "GFP_NOFAIL"}, {(unsigned long)((gfp_t)
> 0x1000u), "GFP_NORETRY"}, {(unsigned long)((gfp_t)0x4000u), "GFP_COMP"}, {(unsig
> ned long)((gfp_t)0x8000u), "GFP_ZERO"}, {(unsigned long)((gfp_t)0x10000u), "GFP_
> NOMEMALLOC"}, {(unsigned long)((gfp_t)0x20000u), "GFP_HARDWALL"}, {(unsigned lon
> g)((gfp_t)0x40000u), "GFP_THISNODE"}, {(unsigned long)((gfp_t)0x80000u), "GFP_RE
> CLAIMABLE"}, {(unsigned long)((gfp_t)0x100000u), "GFP_MOVABLE"}, { -1, ((void *)
> 0) }}; ftrace_print_flags_seq(p, "|", REC->gfp_flags, flags); }) : "GFP_NOWAIT"



I agree with the fact that it must be simplified.




> The language that is added by this patch is of the following:
>
> * FMT := constant string FMT | COMMAND FMT | empty
> * COMMAND := <TYPE:FIELD> | <mask:FIELD:DELIM:MASKS> | <sym:FIELD:SYMBOLS> |
> * <if:FIELD:TRUE:FALSE>
> * TYPE := int | hex | ptr | string | strarray
> * FIELD := defined by the event structure
> * MASKS := MASK=NAME,MASKS | MASK=NAME
> * MASK := the bit mask to match
> * DELIM := delimiter to separate the fields. None and ':' are both allowed
> * SYMBOLS := SYM=NAME,SYMBOLS | SYM=NAME
> * SYM := the symbol value to test against
> * TRUE := print when field is non zero
> * FALSE := print when field is zero or NULL
> * NAME := the name to write when a match is found
> *
> * A '\<' would print '<'


But I wonder if the above new language is not breaking the charm
of the TRACE_EVENT(), which charm is that it's easy to implement (hopefully).

Everyone knows the printk formats. And I guess this new thing is easy and
quick to learn. But because it's a new unknown language, the TRACE_EVENT
will become less readable, less reachable for newcomers in TRACE_EVENT.

I don't know...

Frederic.


> The above examples would then look like:
>
> irq_handler_entry:
> format: irq=<int:irq> handler=<string:name>
>
> softirq_entry:
> format: softirq=<int:vec> action=<sym:vec:0=HI,1=TIMER,2=NET_TX,3=NET_RX,4=BLOCK,5=TASKLET,6=SCHED,7=HRTIMER,8=RCU
>
> kmalloc:
> format: call_site=<hex:call_site> ptr=<ptr:ptr> bytes_req=<int:bytes_req> bytes_alloc=<int:bytes_alloc> gfp_flags=<mask:gfp_flags:|:0=GFP_NOWAIT,0x1200d2=GFP_HIGHUSER_MOVABLE,0x200d2=GFP_HIGHUSER,0x200d0=GFP_USER,0x800d0=GFP_TEMPORARY,0xd0=GFP_KERNEL,0x50=GFP_NOFS,0x20=GFP_ATOMIC,0x10=GFP_NOIO,0x20=GFP_HIGH,0x10=GFP_WAIT,0x40=GFP_IO,0x100=GFP_COLD,0x200=GFP_NOWARN,0x400=GFP_REPEAT,0x800=GFP_NOFAIL,0x1000=GFP_NORETRY,0x4000=GFP_COMP,0x8000=GFP_ZERO,0x10000=GFP_NOMEMALLOC,0x20000=GFP_HARDWALL,0x40000=GFP_THISNODE,0x80000=GFP_RECLAIMABLE,0x100000=GFP_MOVABLE
>
> The above "mask" command takes '0' as a special mask that should be done only in the beginning. It will write that symbol out when the mask is zero.
>
> Another nice thing about this change set is that it can live together with
> the current print format. The old version will show "print fmt:" in
> the output file, where as the new version will use "format:" as shown
> in the above examples.
>
> Both may be used, but it would be preferable to use the new language.
> If the new language is not adequate for a new trace point we can always
> add new types. Userspace tools should just ignore types it does not
> understand.
>
> Signed-off-by: Steven Rostedt <[email protected]>
> ---
> include/linux/ftrace_event.h | 10 +
> include/trace/ftrace.h | 22 ++-
> kernel/trace/Makefile | 1 +
> kernel/trace/trace_read_binary.c | 674 ++++++++++++++++++++++++++++++++++++++
> 4 files changed, 705 insertions(+), 2 deletions(-)
> create mode 100644 kernel/trace/trace_read_binary.c
>
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index 5c093ff..f1b59d3 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -119,6 +119,9 @@ struct ftrace_event_call {
> void *filter;
> void *mod;
>
> + struct list_head print_info;
> + const char *print_text;
> +
> #ifdef CONFIG_EVENT_PROFILE
> atomic_t profile_count;
> int (*profile_enable)(struct ftrace_event_call *);
> @@ -136,6 +139,13 @@ extern int filter_current_check_discard(struct ftrace_event_call *call,
> void *rec,
> struct ring_buffer_event *event);
>
> +extern char *ftrace_read_binary(struct trace_seq *p,
> + struct ftrace_event_call *event,
> + struct trace_entry *entry);
> +extern int ftrace_initialize_print(struct ftrace_event_call *event,
> + const char *fmt, ...)
> + __attribute__ ((format (printf, 2, 3)));
> +
> extern int trace_define_field(struct ftrace_event_call *call, char *type,
> char *name, int offset, int size, int is_signed);
>
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index 40ede4d..e3370c5 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -124,6 +124,10 @@
> #undef TP_printk
> #define TP_printk(fmt, args...) fmt "\n", args
>
> +#undef TP_FORMAT
> +#define TP_FORMAT(fmt, args...) \
> + "%s\n", ftrace_read_binary(p, event_call, entry)
> +
> #undef __get_dynamic_array
> #define __get_dynamic_array(field) \
> ((void *)__entry + __entry->__data_loc_##field)
> @@ -152,6 +156,7 @@
> enum print_line_t \
> ftrace_raw_output_##call(struct trace_iterator *iter, int flags) \
> { \
> + struct ftrace_event_call *event_call __maybe_unused = &event_##call; \
> struct trace_seq *s = &iter->seq; \
> struct ftrace_raw_##call *field; \
> struct trace_entry *entry; \
> @@ -234,7 +239,10 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags) \
> #define __entry REC
>
> #undef TP_printk
> -#define TP_printk(fmt, args...) "%s, %s\n", #fmt, __stringify(args)
> +#define TP_printk(fmt, args...) "print fmt: %s, %s\n", #fmt, __stringify(args)
> +
> +#undef TP_FORMAT
> +#define TP_FORMAT(fmt, args...) "format: " fmt "\n", ##args
>
> #undef TP_fast_assign
> #define TP_fast_assign(args...) args
> @@ -249,7 +257,7 @@ ftrace_format_##call(struct trace_seq *s) \
> \
> tstruct; \
> \
> - trace_seq_printf(s, "\nprint fmt: " print); \
> + trace_seq_printf(s, "\n" print); \
> \
> return ret; \
> }
> @@ -279,6 +287,13 @@ ftrace_format_##call(struct trace_seq *s) \
> offsetof(typeof(field), __data_loc_##item), \
> sizeof(field.__data_loc_##item), 0);
>
> +#undef TP_printk
> +#define TP_printk(fmt, args...)
> +
> +#undef TP_FORMAT
> +#define TP_FORMAT(fmt, args...) \
> + ftrace_initialize_print(event_call, fmt, ##args)
> +
> #undef __string
> #define __string(item, src) __dynamic_array(char, item, -1)
>
> @@ -299,6 +314,8 @@ ftrace_define_fields_##call(void) \
> \
> tstruct; \
> \
> + print; \
> + \
> return ret; \
> }
>
> @@ -563,6 +580,7 @@ static int ftrace_raw_init_event_##call(void) \
> event_##call.id = id; \
> INIT_LIST_HEAD(&event_##call.fields); \
> init_preds(&event_##call); \
> + INIT_LIST_HEAD(&event_##call.print_info); \
> return 0; \
> } \
> \
> diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
> index 06b8585..7c2ff68 100644
> --- a/kernel/trace/Makefile
> +++ b/kernel/trace/Makefile
> @@ -51,5 +51,6 @@ obj-$(CONFIG_EVENT_TRACING) += trace_export.o
> obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o
> obj-$(CONFIG_EVENT_PROFILE) += trace_event_profile.o
> obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
> +obj-$(CONFIG_EVENT_TRACING) += trace_read_binary.o
>
> libftrace-y := ftrace.o
> diff --git a/kernel/trace/trace_read_binary.c b/kernel/trace/trace_read_binary.c
> new file mode 100644
> index 0000000..100d5c0
> --- /dev/null
> +++ b/kernel/trace/trace_read_binary.c
> @@ -0,0 +1,674 @@
> +/*
> + * trace_read_binary.c
> + *
> + * Copyright (C) 2009 Red Hat Inc, Steven Rostedt <[email protected]>
> + *
> + */
> +
> +#include <linux/ftrace_event.h>
> +#include <linux/hardirq.h>
> +#include <linux/module.h>
> +#include <linux/ctype.h>
> +
> +#include "trace.h"
> +
> +static DEFINE_MUTEX(buffer_lock);
> +static struct trace_seq buffer;
> +
> +/*
> + * Binary string parser. The print format uses a special language to explain
> + * the format to print the entry out. The language is as follows:
> + *
> + * FMT := constant string FMT | COMMAND FMT | empty
> + * COMMAND := <TYPE:FIELD> | <mask:FIELD:DELIM:MASKS> | <sym:FIELD:SYMBOLS> |
> + * <if:FIELD:TRUE:FALSE>
> + * TYPE := int | hex | ptr | string | strarray
> + * FIELD := defined by the event structure
> + * MASKS := MASK=NAME,MASKS | MASK=NAME
> + * MASK := the bit mask to match
> + * DELIM := delimiter to separate the fields. None and ':' are both allowed
> + * SYMBOLS := SYM=NAME,SYMBOLS | SYM=NAME
> + * SYM := the symbol value to test against
> + * TRUE := print when field is non zero
> + * FALSE := print when field is zero or NULL
> + * NAME := the name to write when a match is found
> + *
> + * A '\<' would print '<'
> + */
> +
> +#define TOK_SIZE 32
> +
> +enum field_types {
> + FIELD_IS_TEXT,
> + FIELD_IS_INT,
> + FIELD_IS_PTR,
> + FIELD_IS_LT,
> + FIELD_IS_IF,
> + FIELD_IS_STRING,
> + FIELD_IS_STRARRAY,
> + FIELD_IS_HEX,
> + FIELD_IS_MASK,
> + FIELD_IS_SYMBOL,
> +};
> +
> +struct sym_mask {
> + struct list_head list;
> + unsigned long long val;
> + unsigned short start;
> + unsigned short len;
> +};
> +
> +struct print_info {
> + struct list_head list;
> + enum field_types type;
> + union {
> + struct {
> + unsigned short start;
> + unsigned short len;
> + } text;
> + struct {
> + struct ftrace_event_field *field;
> + } data;
> + struct {
> + struct ftrace_event_field *field;
> + unsigned short true_text;
> + unsigned short true_len;
> + unsigned short false_text;
> + unsigned short false_len;
> + } cond;
> + struct {
> + struct ftrace_event_field *field;
> + struct list_head masks;
> + unsigned short delim;
> + unsigned short len;
> + } sym_mask;
> + };
> +};
> +
> +static struct print_info *
> +alloc_print_info(struct ftrace_event_call *call, enum field_types type)
> +{
> + struct print_info *info;
> +
> + info = kmalloc(sizeof(*info), GFP_KERNEL);
> + if (!info)
> + return NULL;
> +
> + info->type = type;
> +
> + list_add_tail(&info->list, &call->print_info);
> +
> + return info;
> +}
> +
> +static int
> +add_text(struct ftrace_event_call *call, const char *start, const char *end)
> +{
> + struct print_info *info;
> +
> + info = alloc_print_info(call, FIELD_IS_TEXT);
> + if (!info)
> + return -ENOMEM;
> +
> + info->text.start = start - call->print_text;
> + if (!end)
> + end = call->print_text + strlen(call->print_text);
> + info->text.len = end - start;
> +
> + return 0;
> +}
> +
> +static int
> +add_less_than(struct ftrace_event_call *call, const char *start, const char *end)
> +{
> + struct print_info *info;
> +
> + info = alloc_print_info(call, FIELD_IS_LT);
> + if (!info)
> + return -ENOMEM;
> +
> + info->text.start = start - call->print_text;
> + info->text.len = end - start;
> +
> + return 0;
> +}
> +
> +static int
> +add_data(struct ftrace_event_call *call, enum field_types type,
> + struct ftrace_event_field *field)
> +{
> + struct print_info *info;
> +
> + info = alloc_print_info(call, type);
> + if (!info)
> + return -ENOMEM;
> +
> + info->data.field = field;
> +
> + return 0;
> +}
> +
> +static int
> +add_if(struct ftrace_event_call *call, struct ftrace_event_field *field,
> + const char *fmt, const char *end)
> +{
> + struct print_info *info;
> + const char *tok;
> +
> + info = alloc_print_info(call, FIELD_IS_IF);
> + if (!info)
> + return -ENOMEM;
> +
> + info->cond.field = field;
> +
> + tok = strchr(fmt, ':');
> + if (!tok || tok > end)
> + return -1;
> +
> + info->cond.true_text = fmt - call->print_text;
> + info->cond.true_len = tok - fmt;
> +
> + fmt = tok + 1;
> +
> + info->cond.false_text = fmt - call->print_text;
> + info->cond.false_len = end - fmt;
> +
> + return 0;
> +}
> +
> +static int add_sym_mask(struct ftrace_event_call *call, struct list_head *list,
> + unsigned long long val,
> + const char *start, const char *end)
> +{
> + struct sym_mask *sm;
> +
> + sm = kmalloc(sizeof(*sm), GFP_KERNEL);
> + if (!sm)
> + return -ENOMEM;
> +
> + list_add_tail(&sm->list, list);
> + sm->val = val;
> + sm->start = start - call->print_text;
> + sm->len = end - start;
> +
> + return 0;
> +}
> +
> +static const char *
> +add_mask(struct ftrace_event_call *call, struct ftrace_event_field *field,
> + const char *delim, unsigned int delim_len,
> + const char *fmt, const char *end)
> +{
> + struct print_info *info;
> + unsigned long long mask;
> + const char *tok;
> +
> + info = alloc_print_info(call, FIELD_IS_MASK);
> + if (!info)
> + return end;
> +
> + info->sym_mask.field = field;
> +
> + INIT_LIST_HEAD(&info->sym_mask.masks);
> + info->sym_mask.len = delim_len;
> + if (delim_len)
> + info->sym_mask.delim = delim - call->print_text;
> +
> + do {
> + while (isspace(*fmt))
> + fmt++;
> +
> + tok = strchr(fmt, '=');
> + if (!tok || tok > end)
> + goto out_err;
> +
> + mask = simple_strtoull(fmt, NULL, 0);
> + fmt = tok + 1;
> +
> + tok = strchr(fmt, ',');
> + if (!tok || tok > end)
> + tok = end;
> +
> + add_sym_mask(call, &info->sym_mask.masks, mask, fmt, tok);
> +
> + fmt = tok + 1;
> + } while (fmt < end);
> +
> + return end;
> + out_err:
> + WARN_ON(1);
> + printk("error in format '%s'\n", fmt);
> + return end;
> +}
> +
> +static const char *
> +add_symbol(struct ftrace_event_call *call, struct ftrace_event_field *field,
> + const char *fmt, const char *end)
> +{
> + struct print_info *info;
> + unsigned long long sym;
> + const char *tok;
> +
> + info = alloc_print_info(call, FIELD_IS_SYMBOL);
> + if (!info)
> + return end;
> +
> + info->sym_mask.field = field;
> +
> + INIT_LIST_HEAD(&info->sym_mask.masks);
> +
> + do {
> + while (isspace(*fmt))
> + fmt++;
> +
> + tok = strchr(fmt, '=');
> + if (!tok || tok > end)
> + goto out_err;
> +
> + sym = simple_strtoull(fmt, NULL, 0);
> + fmt = tok + 1;
> +
> + tok = strchr(fmt, ',');
> + if (!tok || tok > end)
> + tok = end;
> +
> + add_sym_mask(call, &info->sym_mask.masks, sym, fmt, tok);
> +
> + fmt = tok + 1;
> + } while (fmt < end);
> +
> + return end;
> + out_err:
> + WARN_ON(1);
> + printk("error in format '%s'\n", fmt);
> + return end;
> +}
> +
> +static struct ftrace_event_field *
> +find_field(struct ftrace_event_call *call, const char *name, int len)
> +{
> + struct ftrace_event_field *field;
> +
> + list_for_each_entry(field, &call->fields, link) {
> + if (!strncmp(field->name, name, len))
> + return field;
> + }
> +
> + return NULL;
> +}
> +
> +const char *handle_field(struct ftrace_event_call *event,
> + const char *fmt, enum field_types field_type)
> +{
> + struct ftrace_event_field *field;
> + const char *end, *tok, *delim;
> + unsigned int delim_len;
> +
> + end = strchr(fmt, '>');
> + if (!end)
> + goto out_err;
> +
> + switch (field_type) {
> + case FIELD_IS_INT:
> + case FIELD_IS_PTR:
> + case FIELD_IS_HEX:
> + case FIELD_IS_STRING:
> + case FIELD_IS_STRARRAY:
> + field = find_field(event, fmt, end - fmt);
> + if (!field)
> + goto out_err;
> +
> + add_data(event, field_type, field);
> + break;
> +
> + case FIELD_IS_IF:
> + tok = strchr(fmt, ':');
> + if (!tok || tok > end)
> + goto out_err;
> +
> + field = find_field(event, fmt, tok - fmt);
> + if (!field)
> + goto out_err;
> +
> + fmt = tok + 1;
> +
> + add_if(event, field, fmt, end);
> + break;
> +
> + case FIELD_IS_MASK:
> + case FIELD_IS_SYMBOL:
> + tok = strchr(fmt, ':');
> + if (!tok || tok > end)
> + goto out_err;
> +
> + field = find_field(event, fmt, tok - fmt);
> + if (!field)
> + goto out_err;
> +
> + fmt = tok + 1;
> +
> + if (field_type == FIELD_IS_MASK) {
> + tok = strchr(fmt, ':');
> + if (!tok || tok > end)
> + goto out_err;
> +
> + delim = fmt;
> + delim_len = tok - fmt;
> +
> + /* we allow ':' as a delimiter */
> + if (!delim_len && tok[1] == ':') {
> + tok++;
> + delim_len++;
> + }
> +
> + fmt = tok+1;
> +
> + end = add_mask(event, field, delim, delim_len, fmt, end);
> + } else
> + end = add_symbol(event, field, fmt, end);
> +
> + break;
> + default:
> + WARN_ON(1);
> + printk("unknown field\n");
> + }
> +
> + end++;
> + return end;
> +
> + out_err:
> + WARN_ON(1);
> + printk("error in format field: '%s'\n", fmt);
> + return NULL;
> +}
> +
> +int
> +ftrace_initialize_print(struct ftrace_event_call *event, const char *fmt, ...)
> +{
> + const char *tok;
> + va_list ap;
> + int ret;
> +
> + mutex_lock(&buffer_lock);
> + trace_seq_init(&buffer);
> +
> + va_start(ap, fmt);
> + ret = trace_seq_vprintf(&buffer, fmt, ap);
> + va_end(ap);
> + if (!ret)
> + goto err_unlock;
> +
> + ret = trace_seq_putc(&buffer, 0);
> + if (!ret)
> + goto err_unlock;
> +
> + event->print_text = kstrdup(buffer.buffer, GFP_KERNEL);
> + if (!event->print_text)
> + goto err_unlock;
> +
> + mutex_unlock(&buffer_lock);
> +
> + fmt = event->print_text;
> +
> + do {
> + enum field_types field_type;
> +
> + tok = strchr(fmt, '<');
> + if (!tok) {
> + add_text(event, fmt, tok);
> + break;
> + }
> + if (*(tok - 1) == '\\') {
> + add_less_than(event, fmt, tok);
> + fmt = tok + 1;
> + continue;
> + }
> +
> + add_text(event, fmt, tok);
> +
> + fmt = tok + 1;
> +
> + tok = strchr(fmt, ':');
> + if (!tok)
> + goto err_format;
> +
> + if (strncmp(fmt, "int:", 4) == 0)
> + field_type = FIELD_IS_INT;
> +
> + else if (strncmp(fmt, "ptr:", 4) == 0)
> + field_type = FIELD_IS_PTR;
> +
> + else if (strncmp(fmt, "string:", 7) ==0)
> + field_type = FIELD_IS_STRING;
> +
> + else if (strncmp(fmt, "hex:", 4) == 0)
> + field_type = FIELD_IS_HEX;
> +
> + else if (strncmp(fmt, "if:", 3) == 0)
> + field_type = FIELD_IS_IF;
> +
> + else if (strncmp(fmt, "mask:", 5) == 0)
> + field_type = FIELD_IS_MASK;
> +
> + else if (strncmp(fmt, "sym:", 4) == 0)
> + field_type = FIELD_IS_SYMBOL;
> +
> + else if (strncmp(fmt, "strarray:", 9) == 0)
> + field_type = FIELD_IS_STRARRAY;
> +
> + else
> + goto err_format;
> +
> + tok++;
> + fmt = handle_field(event, tok, field_type);
> +
> + } while (fmt);
> +
> + return 0;
> +
> + err_unlock:
> + WARN_ON(1);
> + printk("Can not allocate event print format data\n");
> + mutex_unlock(&buffer_lock);
> + return -1;
> +
> + err_format:
> + WARN_ON(1);
> + printk("error in format type: '%s'\n", fmt);
> + return -1;
> +}
> +EXPORT_SYMBOL_GPL(ftrace_initialize_print);
> +
> +
> +static void
> +trace_read_mask(struct trace_seq *s, unsigned long long val,
> + struct print_info *info, struct ftrace_event_call *event)
> +{
> + unsigned long long mask;
> + struct sym_mask *sm;
> + int first = 1;
> +
> + list_for_each_entry(sm, &info->sym_mask.masks, list) {
> + mask = sm->val;
> +
> + if (first && !mask && !val) {
> + trace_seq_putmem(s, event->print_text + sm->start,
> + sm->len);
> + return;
> + }
> +
> + if (mask && (mask & val) == mask) {
> + if (first)
> + first = 0;
> + else if (info->sym_mask.len)
> + trace_seq_putmem(s, event->print_text +
> + info->sym_mask.delim,
> + info->sym_mask.len);
> + val &= ~mask;
> +
> + trace_seq_putmem(s, event->print_text + sm->start,
> + sm->len);
> + }
> + }
> +
> + if (val)
> + trace_seq_printf(s, "(%llx)", val);
> +
> + return;
> +}
> +
> +static void
> +trace_read_symbol(struct trace_seq *s, unsigned long long val,
> + struct print_info *info, struct ftrace_event_call *event)
> +{
> + unsigned long long sym;
> + struct sym_mask *sm;
> + int found = 0;
> +
> + list_for_each_entry(sm, &info->sym_mask.masks, list) {
> + sym = sm->val;
> +
> + if (sym == val) {
> + found = 1;
> + trace_seq_putmem(s, event->print_text + sm->start,
> + sm->len);
> + break;
> + }
> + }
> +
> + if (!found)
> + trace_seq_printf(s, "(%llx)", val);
> +
> +}
> +
> +char *
> +ftrace_read_binary(struct trace_seq *s, struct ftrace_event_call *event,
> + struct trace_entry *entry)
> +{
> + unsigned long long val, mask;
> + struct print_info *info;
> + char *start = s->buffer + s->len;
> + struct ftrace_event_field *field;
> + void *p;
> +
> + list_for_each_entry(info, &event->print_info, list) {
> +
> + p = entry;
> +
> + switch (info->type) {
> + case FIELD_IS_LT:
> + case FIELD_IS_TEXT:
> + trace_seq_putmem(s, event->print_text + info->text.start,
> + info->text.len);
> + if (info->type == FIELD_IS_LT)
> + trace_seq_putc(s, '<');
> + break;
> + case FIELD_IS_INT:
> + case FIELD_IS_HEX:
> + case FIELD_IS_PTR:
> + field = info->data.field;
> + goto skip_if;
> +
> + case FIELD_IS_IF:
> + field = info->cond.field;
> + skip_if:
> + p += field->offset;
> +
> + switch (field->size) {
> + case 1:
> + val = *(char *)p;
> + mask = 0xffULL;
> + break;
> + case 2:
> + val = *(short *)p;
> + mask = 0xffffULL;
> + break;
> + case 4:
> + val = *(int *)p;
> + mask = 0xffffffffULL;
> + break;
> + case 8:
> + val = *(long long*)p;
> + mask = 0;
> + break;
> +
> + default:
> + trace_seq_printf(s, "<error: bad field size %d?>\n",
> + field->size);
> + return start;
> + }
> +
> + if (info->type == FIELD_IS_IF) {
> + if (val)
> + trace_seq_putmem(s, event->print_text +
> + info->cond.true_text,
> + info->cond.true_len);
> + else
> + trace_seq_putmem(s, event->print_text +
> + info->cond.false_text,
> + info->cond.false_len);
> + } else if (info->type == FIELD_IS_INT)
> + trace_seq_printf(s, "%lld", val);
> + else {
> + /* hex should only print the size specified */
> + if (mask)
> + val &= mask;
> +
> + trace_seq_printf(s, "%llx", val);
> + }
> +
> + break;
> +
> + case FIELD_IS_STRING:
> + p += info->data.field->offset;
> + /* indexes are expected to be unsigned short */
> + WARN_ON(info->data.field->size != 2);
> + p = (void *)entry + *(unsigned short *)p;
> + trace_seq_puts(s, p);
> + break;
> +
> + case FIELD_IS_STRARRAY:
> + p += info->data.field->offset;
> + trace_seq_puts(s, p);
> + break;
> +
> + case FIELD_IS_MASK:
> + case FIELD_IS_SYMBOL:
> +
> + p += info->sym_mask.field->offset;
> +
> + switch (info->sym_mask.field->size) {
> + case 1:
> + val = *(unsigned char *)p;
> + break;
> + case 2:
> + val = *(unsigned short *)p;
> + break;
> + case 4:
> + val = *(unsigned int *)p;
> + break;
> + case 8:
> + val = *(unsigned long long*)p;
> + break;
> +
> + default:
> + trace_seq_printf(s, "<error: bad field size %d?>\n",
> + info->sym_mask.field->size);
> + return start;
> + }
> +
> + if (info->type == FIELD_IS_MASK)
> + trace_read_mask(s, val, info, event);
> + else
> + trace_read_symbol(s, val, info, event);
> + break;
> + default:
> + trace_seq_printf(s, "UNKNOWN TYPE %d\n", info->type);
> + }
> + }
> +
> + trace_seq_putc(s, 0);
> +
> + return start;
> +}
> +EXPORT_SYMBOL_GPL(ftrace_read_binary);
> --
> 1.6.3.1
>
> --

2009-06-09 19:45:51

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing


On Tue, 9 Jun 2009, Frederic Weisbecker wrote:
>
>
> > The language that is added by this patch is of the following:
> >
> > * FMT := constant string FMT | COMMAND FMT | empty
> > * COMMAND := <TYPE:FIELD> | <mask:FIELD:DELIM:MASKS> | <sym:FIELD:SYMBOLS> |
> > * <if:FIELD:TRUE:FALSE>
> > * TYPE := int | hex | ptr | string | strarray
> > * FIELD := defined by the event structure
> > * MASKS := MASK=NAME,MASKS | MASK=NAME
> > * MASK := the bit mask to match
> > * DELIM := delimiter to separate the fields. None and ':' are both allowed
> > * SYMBOLS := SYM=NAME,SYMBOLS | SYM=NAME
> > * SYM := the symbol value to test against
> > * TRUE := print when field is non zero
> > * FALSE := print when field is zero or NULL
> > * NAME := the name to write when a match is found
> > *
> > * A '\<' would print '<'
>
>
> But I wonder if the above new language is not breaking the charm
> of the TRACE_EVENT(), which charm is that it's easy to implement (hopefully).
>
> Everyone knows the printk formats. And I guess this new thing is easy and
> quick to learn. But because it's a new unknown language, the TRACE_EVENT
> will become less readable, less reachable for newcomers in TRACE_EVENT.

I tried to avoid this too, but when I started writing a tool in C that
would parse the format file, I found that the printf was horrible.

Note, I will still keep the TP_printk() macro, that will not change. The
new macro is TP_FORMAT() that preforms the tags. Thus, if you really want
it to print out, you can use TP_printk, but the user space tools that read
the binary will not know how to read it unless the printk is simple.

I really want to keep this language simple. If anyone has any better
ideas, I'm all for it.

I barfed when I saw this in irq_handler_entry:

print fmt: "irq=%d handler=%s", REC->irq, (char *)((void *)REC + REC->__data_loc_name)

That we probably can get to show something like:

print fmt: "irq=%d handler=%s", REC->irq, __get_string(REC->name)

I plan on also adding some type tags. Perhaps adding a GFP that would do
the flags for kmem, and have a way to export the values in another file.

But for now, I would like to add the tags "major" and "minor" that would
get the major and minor from a device:

<major:dev> <minor:dev>

And also one that can handle time:

<nsec2usec:nsec> that would convert a ns time to the "%lu.%03lu" format,
or even a <nsec2sec:n:nsec> that would convert to "%lu.%0" n "lu" format.

Again, anyone have any better ideas?

-- Steve


2009-06-09 20:58:11

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing

* Steven Rostedt ([email protected]) wrote:

> Again, anyone have any better ideas?
>
> -- Steve
>

Hi Steven,

sorry I don't have much time to look into this. My ph.d. thesis is due
for july 5th..

I just wanted to let you now that I wrote a very simple XML parser in
the early days of LTTng. I dumped it in favor of the trace_mark() format
strings, but it really gave much more flexibility to describe events,
including nested structures, unions... etc.

It's written in C with almost no external dependency. The goal was to
make it so it could be acceptable in the kernel tree. I think you might
want to have a quick look at it. It generated the probe callbacks which
were, at the time, inlines. Since then I realised that inlining the
tracer code in the kernel call site pollutes the i-cache, so it's not
the way I would deal with code generation today, but those are minor
details.

It would need a cleanup. I did not write the original version of this
code, but rather kicked it until it worked as I wanted it. It worked
with the old LTTng versions:

http://www.lttng.org/files/packages/genevent-0.35.tar.gz

The package contains sample XML files.

Good luck,

Mathieu


--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-06-10 01:58:11

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing

Steven Rostedt wrote:
> On Tue, 9 Jun 2009, Frederic Weisbecker wrote:
>>
>>> The language that is added by this patch is of the following:
>>>
>>> * FMT := constant string FMT | COMMAND FMT | empty
>>> * COMMAND := <TYPE:FIELD> | <mask:FIELD:DELIM:MASKS> | <sym:FIELD:SYMBOLS> |
>>> * <if:FIELD:TRUE:FALSE>
>>> * TYPE := int | hex | ptr | string | strarray
>>> * FIELD := defined by the event structure
>>> * MASKS := MASK=NAME,MASKS | MASK=NAME
>>> * MASK := the bit mask to match
>>> * DELIM := delimiter to separate the fields. None and ':' are both allowed
>>> * SYMBOLS := SYM=NAME,SYMBOLS | SYM=NAME
>>> * SYM := the symbol value to test against
>>> * TRUE := print when field is non zero
>>> * FALSE := print when field is zero or NULL
>>> * NAME := the name to write when a match is found
>>> *
>>> * A '\<' would print '<'
>>
>> But I wonder if the above new language is not breaking the charm
>> of the TRACE_EVENT(), which charm is that it's easy to implement (hopefully).
>>
>> Everyone knows the printk formats. And I guess this new thing is easy and
>> quick to learn. But because it's a new unknown language, the TRACE_EVENT
>> will become less readable, less reachable for newcomers in TRACE_EVENT.
>
> I tried to avoid this too, but when I started writing a tool in C that
> would parse the format file, I found that the printf was horrible.
>
> Note, I will still keep the TP_printk() macro, that will not change. The
> new macro is TP_FORMAT() that preforms the tags. Thus, if you really want
> it to print out, you can use TP_printk, but the user space tools that read
> the binary will not know how to read it unless the printk is simple.
>
> I really want to keep this language simple. If anyone has any better
> ideas, I'm all for it.
>
> I barfed when I saw this in irq_handler_entry:
>
> print fmt: "irq=%d handler=%s", REC->irq, (char *)((void *)REC + REC->__data_loc_name)
>

I barfed too when I saw this. Introducing a new format is
very meaningful. And this print format looks well.
Good Work!

I'm trying to write user-space tools. Could you tell me your tools'
developing status? And the aim of these tools? Will it be public?

I'm wandering whether I should stop this trying.

Lai.

2009-06-10 05:37:23

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing


On Wed, 10 Jun 2009, Lai Jiangshan wrote:
>
> I'm trying to write user-space tools. Could you tell me your tools'
> developing status? And the aim of these tools? Will it be public?

I'm toying with a command line parser, and something to hook with Ingo's
perf.

>
> I'm wandering whether I should stop this trying.

Well, more ideas the better, anything that is done can probably be used
;-)

-- Steve

2009-06-10 09:37:20

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing

On Tue, Jun 09, 2009 at 03:45:36PM -0400, Steven Rostedt wrote:
> Note, I will still keep the TP_printk() macro, that will not change. The
> new macro is TP_FORMAT() that preforms the tags. Thus, if you really want
> it to print out, you can use TP_printk, but the user space tools that read
> the binary will not know how to read it unless the printk is simple.

Keeping the two in parallel for more than a short migration period
will be a nightmare. We should have one way to express all our tracers,
and not keep on adding special cases that userland has to deal with.

2009-06-10 09:48:30

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing

On Tue, Jun 09, 2009 at 09:22:01PM +0200, Frederic Weisbecker wrote:
> But I wonder if the above new language is not breaking the charm
> of the TRACE_EVENT(), which charm is that it's easy to implement (hopefully).
>
> Everyone knows the printk formats. And I guess this new thing is easy and
> quick to learn. But because it's a new unknown language, the TRACE_EVENT
> will become less readable, less reachable for newcomers in TRACE_EVENT.

I must also say I don't particularly like it. printk is nice and easy
an everybody knows it, but it's not quite flexible enough as we might
have to do all kinds of conversions on the reader side. What might be
a better idea is to just have C function pointer for output conversions
that could be put into the a file in debugfs and used by the binary
trace buffer reader. Or maybe not as we would pull in too many
depenencies.

I think we should go with the printk solution for 2.6.31 and use the
full development cycle for 2.6.32 to come up with something better.

As soon as a couple of large subsystems use the even tracer we also
have a broader base examples to see how new syntax works on them.

2009-06-10 10:12:21

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing


* Christoph Hellwig <[email protected]> wrote:

> On Tue, Jun 09, 2009 at 09:22:01PM +0200, Frederic Weisbecker wrote:
> > But I wonder if the above new language is not breaking the charm
> > of the TRACE_EVENT(), which charm is that it's easy to implement (hopefully).
> >
> > Everyone knows the printk formats. And I guess this new thing is
> > easy and quick to learn. But because it's a new unknown
> > language, the TRACE_EVENT will become less readable, less
> > reachable for newcomers in TRACE_EVENT.
>
> I must also say I don't particularly like it. printk is nice and
> easy an everybody knows it, but it's not quite flexible enough as
> we might have to do all kinds of conversions on the reader side.
> What might be a better idea is to just have C function pointer for
> output conversions that could be put into the a file in debugfs
> and used by the binary trace buffer reader. Or maybe not as we
> would pull in too many depenencies.

Another bigger problem with the new tag format, beyond introducing
an arbitrary descriptor language (which is easy to mess up) is the
loss of type checking.

With the tags the field printouts can go stray easily - while with
TP_printk() we had printf type checking. (which, as imperfect as it
may be to specify a format, does create a real connection between
the record and the output format specification.)

> I think we should go with the printk solution for 2.6.31 and use
> the full development cycle for 2.6.32 to come up with something
> better.
>
> As soon as a couple of large subsystems use the even tracer we
> also have a broader base examples to see how new syntax works on
> them.

I think much of the tooling problem could be solved with a little
trick: the format string can be injected into an artificial .c file
(runtime), and the tool could compile that .c file (in user-space)
and get access to the result.

For example, one of the more complex block tracepoints,
/debug/tracing/events/block/block_bio_backmerge:

print fmt: "%d,%d %s %llu + %u [%s]", ((unsigned int) ((REC->dev) >>
20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), REC->rwbs,
(unsigned long long)REC->sector, REC->nr_sector, REC->comm

when pasted verbatim into the stub below, produces:

0,6 a 7 + 8 [abc]

Note that i pasted the format string into the code below unchanged,
and i used the format descriptor to create the record type. (this
too is easy to automate).

If this is generated into the following function:

format_block_bio_backmerge(struct record *rec);

and a small dynamic library is built out of it, tooling can use
dlopen() to load those format printing stubs.

It's all pretty straightforward and can be used for arbitrarily
complex formats.

And i kind of like the whole notion on a design level as weell: the
kernel exporting C source code for tools :-)

Ingo

------------------>

struct record {
unsigned short common_type;
unsigned char common_flags;
unsigned char common_preempt;
int common_pid;
int common_tgid;
int dev;
unsigned long long sector;
unsigned int nr_sector;
char rwbs[6];
char comm[16];
} this_record = { 1, 2, 3, 4, 5, 6, 7, 8, { 'a', }, "abc" };

void main(void)
{
struct record *REC = &this_record;

printf("%d,%d %s %llu + %u [%s]", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), REC->rwbs, (unsigned long long)REC->sector, REC->nr_sector, REC->comm);
}

2009-06-10 11:31:14

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing

2009/6/10 Ingo Molnar <[email protected]>:
>
> * Christoph Hellwig <[email protected]> wrote:
>
>> On Tue, Jun 09, 2009 at 09:22:01PM +0200, Frederic Weisbecker wrote:
>> > But I wonder if the above new language is not breaking the charm
>> > of the TRACE_EVENT(), which charm is that it's easy to implement (hopefully).
>> >
>> > Everyone knows the printk formats. And I guess this new thing is
>> > easy and quick to learn. But because it's a new unknown
>> > language, the TRACE_EVENT will become less readable, less
>> > reachable for newcomers in TRACE_EVENT.
>>
>> I must also say I don't particularly like it. ?printk is nice and
>> easy an everybody knows it, but it's not quite flexible enough as
>> we might have to do all kinds of conversions on the reader side.
>> What might be a better idea is to just have C function pointer for
>> output conversions that could be put into the a file in debugfs
>> and used by the binary trace buffer reader. ?Or maybe not as we
>> would pull in too many depenencies.
>
> Another bigger problem with the new tag format, beyond introducing
> an arbitrary descriptor language (which is easy to mess up) is the
> loss of type checking.
>
> With the tags the field printouts can go stray easily - while with
> TP_printk() we had printf type checking. (which, as imperfect as it
> may be to specify a format, does create a real connection between
> the record and the output format specification.)
>
>> I think we should go with the printk solution for 2.6.31 and use
>> the full development cycle for 2.6.32 to come up with something
>> better.
>>
>> As soon as a couple of large subsystems use the even tracer we
>> also have a broader base examples to see how new syntax works on
>> them.
>
> I think much of the tooling problem could be solved with a little
> trick: the format string can be injected into an artificial .c file
> (runtime), and the tool could compile that .c file (in user-space)
> and get access to the result.
>
> For example, one of the more complex block tracepoints,
> /debug/tracing/events/block/block_bio_backmerge:
>
> print fmt: "%d,%d %s %llu + %u [%s]", ((unsigned int) ((REC->dev) >>
> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), REC->rwbs,
> (unsigned long long)REC->sector, REC->nr_sector, REC->comm
>
> when pasted verbatim into the stub below, produces:
>
> ? 0,6 a 7 + 8 [abc]
>
> Note that i pasted the format string into the code below unchanged,
> and i used the format descriptor to create the record type. (this
> too is easy to automate).
>
> If this is generated into the following function:
>
> ?format_block_bio_backmerge(struct record *rec);
>
> and a small dynamic library is built out of it, tooling can use
> dlopen() to load those format printing stubs.
>
> It's all pretty straightforward and can be used for arbitrarily
> complex formats.



Hmm



> And i kind of like the whole notion on a design level as weell: the
> kernel exporting C source code for tools :-)
>
> ? ? ? ?Ingo
>
> ------------------>
>
> struct record {
> ? ? ? ?unsigned short common_type;
> ? ? ? ?unsigned char common_flags;
> ? ? ? ?unsigned char common_preempt;
> ? ? ? ?int common_pid;
> ? ? ? ?int common_tgid;
> ? ? ? ?int dev;
> ? ? ? ?unsigned long long sector;
> ? ? ? ?unsigned int nr_sector;
> ? ? ? ?char rwbs[6];
> ? ? ? ?char comm[16];
> } this_record = { 1, 2, 3, 4, 5, 6, 7, 8, { 'a', }, "abc" };
>
> void main(void)
> {
> ? ? ? ?struct record *REC = &this_record;
>
> ? ? ? ?printf("%d,%d %s %llu + %u [%s]", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), REC->rwbs, (unsigned long long)REC->sector, REC->nr_sector, REC->comm);
> }


Yeah it's a quite nice idea.
But it's assuming everyone parses binary files using C programs.
Usually, such parsing
more likely involves the use of scripting languages.

2009-06-10 11:51:51

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing

Le 10 juin 2009 13:31, Fr?d?ric Weisbecker<[email protected]> a ?crit :
> 2009/6/10 Ingo Molnar <[email protected]>:
>>
>> * Christoph Hellwig <[email protected]> wrote:
>>
>>> On Tue, Jun 09, 2009 at 09:22:01PM +0200, Frederic Weisbecker wrote:
>>> > But I wonder if the above new language is not breaking the charm
>>> > of the TRACE_EVENT(), which charm is that it's easy to implement (hopefully).
>>> >
>>> > Everyone knows the printk formats. And I guess this new thing is
>>> > easy and quick to learn. But because it's a new unknown
>>> > language, the TRACE_EVENT will become less readable, less
>>> > reachable for newcomers in TRACE_EVENT.
>>>
>>> I must also say I don't particularly like it. ?printk is nice and
>>> easy an everybody knows it, but it's not quite flexible enough as
>>> we might have to do all kinds of conversions on the reader side.
>>> What might be a better idea is to just have C function pointer for
>>> output conversions that could be put into the a file in debugfs
>>> and used by the binary trace buffer reader. ?Or maybe not as we
>>> would pull in too many depenencies.
>>
>> Another bigger problem with the new tag format, beyond introducing
>> an arbitrary descriptor language (which is easy to mess up) is the
>> loss of type checking.
>>
>> With the tags the field printouts can go stray easily - while with
>> TP_printk() we had printf type checking. (which, as imperfect as it
>> may be to specify a format, does create a real connection between
>> the record and the output format specification.)
>>
>>> I think we should go with the printk solution for 2.6.31 and use
>>> the full development cycle for 2.6.32 to come up with something
>>> better.
>>>
>>> As soon as a couple of large subsystems use the even tracer we
>>> also have a broader base examples to see how new syntax works on
>>> them.
>>
>> I think much of the tooling problem could be solved with a little
>> trick: the format string can be injected into an artificial .c file
>> (runtime), and the tool could compile that .c file (in user-space)
>> and get access to the result.
>>
>> For example, one of the more complex block tracepoints,
>> /debug/tracing/events/block/block_bio_backmerge:
>>
>> print fmt: "%d,%d %s %llu + %u [%s]", ((unsigned int) ((REC->dev) >>
>> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), REC->rwbs,
>> (unsigned long long)REC->sector, REC->nr_sector, REC->comm
>>
>> when pasted verbatim into the stub below, produces:
>>
>> ? 0,6 a 7 + 8 [abc]
>>
>> Note that i pasted the format string into the code below unchanged,
>> and i used the format descriptor to create the record type. (this
>> too is easy to automate).
>>
>> If this is generated into the following function:
>>
>> ?format_block_bio_backmerge(struct record *rec);
>>
>> and a small dynamic library is built out of it, tooling can use
>> dlopen() to load those format printing stubs.
>>
>> It's all pretty straightforward and can be used for arbitrarily
>> complex formats.
>
>
>
> Hmm
>
>
>
>> And i kind of like the whole notion on a design level as weell: the
>> kernel exporting C source code for tools :-)
>>
>> ? ? ? ?Ingo
>>
>> ------------------>
>>
>> struct record {
>> ? ? ? ?unsigned short common_type;
>> ? ? ? ?unsigned char common_flags;
>> ? ? ? ?unsigned char common_preempt;
>> ? ? ? ?int common_pid;
>> ? ? ? ?int common_tgid;
>> ? ? ? ?int dev;
>> ? ? ? ?unsigned long long sector;
>> ? ? ? ?unsigned int nr_sector;
>> ? ? ? ?char rwbs[6];
>> ? ? ? ?char comm[16];
>> } this_record = { 1, 2, 3, 4, 5, 6, 7, 8, { 'a', }, "abc" };
>>
>> void main(void)
>> {
>> ? ? ? ?struct record *REC = &this_record;
>>
>> ? ? ? ?printf("%d,%d %s %llu + %u [%s]", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), REC->rwbs, (unsigned long long)REC->sector, REC->nr_sector, REC->comm);
>> }
>
>
> Yeah it's a quite nice idea.
> But it's assuming everyone parses binary files using C programs.
> Usually, such parsing
> more likely involves the use of scripting languages.
>

While thinking more about it your idea can produce a whole parser
automatically and dynamically written.
So actually, looks like a right direction :-)

2009-06-10 12:18:24

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing


On Wed, 10 Jun 2009, Ingo Molnar wrote:
>
> And i kind of like the whole notion on a design level as weell: the
> kernel exporting C source code for tools :-)
>
> Ingo
>
> ------------------>
>
> struct record {
> unsigned short common_type;
> unsigned char common_flags;
> unsigned char common_preempt;
> int common_pid;
> int common_tgid;
> int dev;
> unsigned long long sector;
> unsigned int nr_sector;
> char rwbs[6];
> char comm[16];
> } this_record = { 1, 2, 3, 4, 5, 6, 7, 8, { 'a', }, "abc" };
>
> void main(void)
> {
> struct record *REC = &this_record;
>
> printf("%d,%d %s %llu + %u [%s]", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), REC->rwbs, (unsigned long long)REC->sector, REC->nr_sector, REC->comm);
> }

I actually tried this first. But it breaks once we start getting types
into the code:

print fmt: "call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s",
REC->call_site, REC->ptr, REC->bytes_req, REC->bytes_alloc,
(REC->gfp_flags) ? ({ static const struct trace_print_flags
flags[] = { {(unsigned long)(((gfp_t)0x10u) | ((gfp_t)0x40u) |
((gfp_t)0x80u) | ((gfp_t)0x20000u) | ((gfp_t)0x02u) |
((gfp_t)0x100000u)), "GFP_HIGHUSER_MOVABLE"}, [...]

Will break on gfp_t. We also have cases where the enum name slips in too:

print fmt: "softirq=%d action=%s", REC->vec, ({ static const struct trace_print_flags symbols[] =
{ { HI_SOFTIRQ, "HI" }, { TIMER_SOFTIRQ, "TIMER" }, { NET_TX_SOFTIRQ, "NET_TX" },
{ NET_RX_SOFTIRQ, "NET_RX" }, { BLOCK_SOFTIRQ, "BLOCK" },
{ TASKLET_SOFTIRQ, "TASKLET" }, { SCHED_SOFTIRQ, "SCHED" },
{ HRTIMER_SOFTIRQ, "HRTIMER" }, { RCU_SOFTIRQ, "RCU" },
{ -1, ((void *)0) }}; ftrace_print_symbols_seq(p, REC->vec, symbols); })

Yes we can add special types for things like gfp_t, but as we get more and
more users of TRACE_EVENT, the tools will never be able to keep up.

-- Steve

2009-06-10 12:47:24

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing


On Wed, 10 Jun 2009, Christoph Hellwig wrote:

> On Tue, Jun 09, 2009 at 09:22:01PM +0200, Frederic Weisbecker wrote:
> > But I wonder if the above new language is not breaking the charm
> > of the TRACE_EVENT(), which charm is that it's easy to implement (hopefully).
> >
> > Everyone knows the printk formats. And I guess this new thing is easy and
> > quick to learn. But because it's a new unknown language, the TRACE_EVENT
> > will become less readable, less reachable for newcomers in TRACE_EVENT.
>
> I must also say I don't particularly like it. printk is nice and easy
> an everybody knows it, but it's not quite flexible enough as we might
> have to do all kinds of conversions on the reader side. What might be
> a better idea is to just have C function pointer for output conversions
> that could be put into the a file in debugfs and used by the binary
> trace buffer reader. Or maybe not as we would pull in too many
> depenencies.

Yeah, having a printk function in debugfs that is controlled by the kernel
sounds wrong on so many levels ;-)

>
> I think we should go with the printk solution for 2.6.31 and use the
> full development cycle for 2.6.32 to come up with something better.
>
> As soon as a couple of large subsystems use the even tracer we also
> have a broader base examples to see how new syntax works on them.

Fair enough. I'm just worried that we'll start to get users that depend on
the printk format. Perhaps the better idea is to remove the printk format
from the debugfs format file, and force all tools to add their own for
every event? At least until we can come up with a solution to dynamically do this?

-- Steve

2009-06-10 14:32:58

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing

* Ingo Molnar ([email protected]) wrote:
>
> * Christoph Hellwig <[email protected]> wrote:
>
> > On Tue, Jun 09, 2009 at 09:22:01PM +0200, Frederic Weisbecker wrote:
> > > But I wonder if the above new language is not breaking the charm
> > > of the TRACE_EVENT(), which charm is that it's easy to implement (hopefully).
> > >
> > > Everyone knows the printk formats. And I guess this new thing is
> > > easy and quick to learn. But because it's a new unknown
> > > language, the TRACE_EVENT will become less readable, less
> > > reachable for newcomers in TRACE_EVENT.
> >
> > I must also say I don't particularly like it. printk is nice and
> > easy an everybody knows it, but it's not quite flexible enough as
> > we might have to do all kinds of conversions on the reader side.
> > What might be a better idea is to just have C function pointer for
> > output conversions that could be put into the a file in debugfs
> > and used by the binary trace buffer reader. Or maybe not as we
> > would pull in too many depenencies.
>
> Another bigger problem with the new tag format, beyond introducing
> an arbitrary descriptor language (which is easy to mess up) is the
> loss of type checking.
>
> With the tags the field printouts can go stray easily - while with
> TP_printk() we had printf type checking. (which, as imperfect as it
> may be to specify a format, does create a real connection between
> the record and the output format specification.)
>
> > I think we should go with the printk solution for 2.6.31 and use
> > the full development cycle for 2.6.32 to come up with something
> > better.
> >
> > As soon as a couple of large subsystems use the even tracer we
> > also have a broader base examples to see how new syntax works on
> > them.
>
> I think much of the tooling problem could be solved with a little
> trick: the format string can be injected into an artificial .c file
> (runtime), and the tool could compile that .c file (in user-space)
> and get access to the result.
>
> For example, one of the more complex block tracepoints,
> /debug/tracing/events/block/block_bio_backmerge:
>
> print fmt: "%d,%d %s %llu + %u [%s]", ((unsigned int) ((REC->dev) >>
> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), REC->rwbs,
> (unsigned long long)REC->sector, REC->nr_sector, REC->comm
>
> when pasted verbatim into the stub below, produces:
>
> 0,6 a 7 + 8 [abc]
>
> Note that i pasted the format string into the code below unchanged,
> and i used the format descriptor to create the record type. (this
> too is easy to automate).
>
> If this is generated into the following function:
>
> format_block_bio_backmerge(struct record *rec);
>
> and a small dynamic library is built out of it, tooling can use
> dlopen() to load those format printing stubs.
>
> It's all pretty straightforward and can be used for arbitrarily
> complex formats.
>
> And i kind of like the whole notion on a design level as weell: the
> kernel exporting C source code for tools :-)
>

Hrm, it's problematic for users who run the userspace analysis tools on
different machine than their kernel. And also problematic for 64-bits
kernel/32-bits userland. A lot of embedded developers run on very
resource limited ARM boards and have to analyse the traces on a
different machine.

It would be much more flexible if we parse the event format description
from a userland tool than trying to use C as a direct way to export
trace metadata, which would cause us to build an ABI-specific,
non-portable, analyser tool.

Mathieu


> Ingo
>
> ------------------>
>
> struct record {
> unsigned short common_type;
> unsigned char common_flags;
> unsigned char common_preempt;
> int common_pid;
> int common_tgid;
> int dev;
> unsigned long long sector;
> unsigned int nr_sector;
> char rwbs[6];
> char comm[16];
> } this_record = { 1, 2, 3, 4, 5, 6, 7, 8, { 'a', }, "abc" };
>
> void main(void)
> {
> struct record *REC = &this_record;
>
> printf("%d,%d %s %llu + %u [%s]", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), REC->rwbs, (unsigned long long)REC->sector, REC->nr_sector, REC->comm);
> }
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-06-10 17:17:54

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing


* Steven Rostedt <[email protected]> wrote:

>
> On Wed, 10 Jun 2009, Ingo Molnar wrote:
> >
> > And i kind of like the whole notion on a design level as weell: the
> > kernel exporting C source code for tools :-)
> >
> > Ingo
> >
> > ------------------>
> >
> > struct record {
> > unsigned short common_type;
> > unsigned char common_flags;
> > unsigned char common_preempt;
> > int common_pid;
> > int common_tgid;
> > int dev;
> > unsigned long long sector;
> > unsigned int nr_sector;
> > char rwbs[6];
> > char comm[16];
> > } this_record = { 1, 2, 3, 4, 5, 6, 7, 8, { 'a', }, "abc" };
> >
> > void main(void)
> > {
> > struct record *REC = &this_record;
> >
> > printf("%d,%d %s %llu + %u [%s]", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), REC->rwbs, (unsigned long long)REC->sector, REC->nr_sector, REC->comm);
> > }
>
> I actually tried this first. But it breaks once we start getting types
> into the code:
>
> print fmt: "call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s",
> REC->call_site, REC->ptr, REC->bytes_req, REC->bytes_alloc,
> (REC->gfp_flags) ? ({ static const struct trace_print_flags
> flags[] = { {(unsigned long)(((gfp_t)0x10u) | ((gfp_t)0x40u) |
> ((gfp_t)0x80u) | ((gfp_t)0x20000u) | ((gfp_t)0x02u) |
> ((gfp_t)0x100000u)), "GFP_HIGHUSER_MOVABLE"}, [...]
>
> Will break on gfp_t. [...]

It wont break if we introduce a couple of common-sense types into
the parsing/translation code. gfp_t is well-known.

Modules wont be able to generate new dynamic types - but that's OK i
think, existing C types and common kernel types (and anything else
we add) ought to be plenty enough.

> [...] We also have cases where the enum name slips in too:
>
> print fmt: "softirq=%d action=%s", REC->vec, ({ static const struct trace_print_flags symbols[] =
> { { HI_SOFTIRQ, "HI" }, { TIMER_SOFTIRQ, "TIMER" }, { NET_TX_SOFTIRQ, "NET_TX" },
> { NET_RX_SOFTIRQ, "NET_RX" }, { BLOCK_SOFTIRQ, "BLOCK" },
> { TASKLET_SOFTIRQ, "TASKLET" }, { SCHED_SOFTIRQ, "SCHED" },
> { HRTIMER_SOFTIRQ, "HRTIMER" }, { RCU_SOFTIRQ, "RCU" },
> { -1, ((void *)0) }}; ftrace_print_symbols_seq(p, REC->vec, symbols); })
>
> Yes we can add special types for things like gfp_t, but as we get
> more and more users of TRACE_EVENT, the tools will never be able
> to keep up.

i still disagree. The tool will have to know about gfp_t in the tag
language too. So there's always going to be a constant expansion of
the type space.

The point is that the number of new types is an order of magnitude
less than the number of new tracepoints.

Also, with tools like perf in the kernel repo under tools/perf/,
we'll be able to keep up with mainline types very flexibly and very
accurately.

Ingo

2009-06-10 17:56:59

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC PATCH 2/5] tracing/events: nicer print format for parsing


On Wed, 10 Jun 2009, Ingo Molnar wrote:
> >
> > I actually tried this first. But it breaks once we start getting types
> > into the code:
> >
> > print fmt: "call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s",
> > REC->call_site, REC->ptr, REC->bytes_req, REC->bytes_alloc,
> > (REC->gfp_flags) ? ({ static const struct trace_print_flags
> > flags[] = { {(unsigned long)(((gfp_t)0x10u) | ((gfp_t)0x40u) |
> > ((gfp_t)0x80u) | ((gfp_t)0x20000u) | ((gfp_t)0x02u) |
> > ((gfp_t)0x100000u)), "GFP_HIGHUSER_MOVABLE"}, [...]
> >
> > Will break on gfp_t. [...]
>
> It wont break if we introduce a couple of common-sense types into
> the parsing/translation code. gfp_t is well-known.
>
> Modules wont be able to generate new dynamic types - but that's OK i
> think, existing C types and common kernel types (and anything else
> we add) ought to be plenty enough.
>
> > [...] We also have cases where the enum name slips in too:
> >
> > print fmt: "softirq=%d action=%s", REC->vec, ({ static const struct trace_print_flags symbols[] =
> > { { HI_SOFTIRQ, "HI" }, { TIMER_SOFTIRQ, "TIMER" }, { NET_TX_SOFTIRQ, "NET_TX" },
> > { NET_RX_SOFTIRQ, "NET_RX" }, { BLOCK_SOFTIRQ, "BLOCK" },
> > { TASKLET_SOFTIRQ, "TASKLET" }, { SCHED_SOFTIRQ, "SCHED" },
> > { HRTIMER_SOFTIRQ, "HRTIMER" }, { RCU_SOFTIRQ, "RCU" },
> > { -1, ((void *)0) }}; ftrace_print_symbols_seq(p, REC->vec, symbols); })
> >
> > Yes we can add special types for things like gfp_t, but as we get
> > more and more users of TRACE_EVENT, the tools will never be able
> > to keep up.
>
> i still disagree. The tool will have to know about gfp_t in the tag
> language too. So there's always going to be a constant expansion of
> the type space.
>
> The point is that the number of new types is an order of magnitude
> less than the number of new tracepoints.
>
> Also, with tools like perf in the kernel repo under tools/perf/,
> we'll be able to keep up with mainline types very flexibly and very
> accurately.

I rather include a C parser than require dynamic compiling of the code to
read the file. The want to be able to read the binary data from other
systems. The output could include the format, and the binary data.

I'm working on ways to make the printk output in the format file not so
ugly. There's tricks I can still do with __get_str and friends. But this
still does not solve the issues with enums.

But my CPP skills are starting to incur another boost, lets see what else
I can come up with ;-)

-- Steve

2009-06-10 18:39:33

by Steven Rostedt

[permalink] [raw]
Subject: [PATCH][GIT PULL] tracing: do not translate event helper macros in print format


Ingo,

This looks like it changed more than it really did, I only moved the one
macro above another and added a few lines:

+#undef __print_symbolic
+#undef __get_dynamic_array
+#undef __get_str
+

Please pull the latest tip/tracing/ftrace-1 tree, which can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
tip/tracing/ftrace-1


Steven Rostedt (1):
tracing: do not translate event helper macros in print format

----
include/trace/ftrace.h | 158 ++++++++++++++++++++++++-----------------------
1 files changed, 81 insertions(+), 77 deletions(-)
---------------------------
commit 6ff9a64d2aaa6eae396adc95e9c91c0cbfa6dbe4
Author: Steven Rostedt <[email protected]>
Date: Wed Jun 10 14:28:34 2009 -0400

tracing: do not translate event helper macros in print format

By moving the macro that creates the print format code above the
defining of the event macro helpers (__get_str, __print_symbolic,
and __get_dynamic_array), we get a little cleaner print format.

Instead of:

(char *)((void *)REC + REC->__data_loc_name)

we get:

__get_str(name)

Instead of:

({ static const struct trace_print_flags symbols[] = { { HI_SOFTIRQ, "HI" }, {

we get:

__print_symbolic(REC->vec, { HI_SOFTIRQ, "HI" }, {

Signed-off-by: Steven Rostedt <[email protected]>

diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 40ede4d..1867553 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -81,6 +81,87 @@
#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)

/*
+ * Setup the showing format of trace point.
+ *
+ * int
+ * ftrace_format_##call(struct trace_seq *s)
+ * {
+ * struct ftrace_raw_##call field;
+ * int ret;
+ *
+ * ret = trace_seq_printf(s, #type " " #item ";"
+ * " offset:%u; size:%u;\n",
+ * offsetof(struct ftrace_raw_##call, item),
+ * sizeof(field.type));
+ *
+ * }
+ */
+
+#undef TP_STRUCT__entry
+#define TP_STRUCT__entry(args...) args
+
+#undef __field
+#define __field(type, item) \
+ ret = trace_seq_printf(s, "\tfield:" #type " " #item ";\t" \
+ "offset:%u;\tsize:%u;\n", \
+ (unsigned int)offsetof(typeof(field), item), \
+ (unsigned int)sizeof(field.item)); \
+ if (!ret) \
+ return 0;
+
+#undef __array
+#define __array(type, item, len) \
+ ret = trace_seq_printf(s, "\tfield:" #type " " #item "[" #len "];\t" \
+ "offset:%u;\tsize:%u;\n", \
+ (unsigned int)offsetof(typeof(field), item), \
+ (unsigned int)sizeof(field.item)); \
+ if (!ret) \
+ return 0;
+
+#undef __dynamic_array
+#define __dynamic_array(type, item, len) \
+ ret = trace_seq_printf(s, "\tfield:__data_loc " #item ";\t" \
+ "offset:%u;\tsize:%u;\n", \
+ (unsigned int)offsetof(typeof(field), \
+ __data_loc_##item), \
+ (unsigned int)sizeof(field.__data_loc_##item)); \
+ if (!ret) \
+ return 0;
+
+#undef __string
+#define __string(item, src) __dynamic_array(char, item, -1)
+
+#undef __entry
+#define __entry REC
+
+#undef __print_symbolic
+#undef __get_dynamic_array
+#undef __get_str
+
+#undef TP_printk
+#define TP_printk(fmt, args...) "%s, %s\n", #fmt, __stringify(args)
+
+#undef TP_fast_assign
+#define TP_fast_assign(args...) args
+
+#undef TRACE_EVENT
+#define TRACE_EVENT(call, proto, args, tstruct, func, print) \
+static int \
+ftrace_format_##call(struct trace_seq *s) \
+{ \
+ struct ftrace_raw_##call field __attribute__((unused)); \
+ int ret = 0; \
+ \
+ tstruct; \
+ \
+ trace_seq_printf(s, "\nprint fmt: " print); \
+ \
+ return ret; \
+}
+
+#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
+
+/*
* Stage 3 of the trace events.
*
* Override the macros in <trace/trace_events.h> to include the following:
@@ -179,83 +260,6 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags) \

#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)

-/*
- * Setup the showing format of trace point.
- *
- * int
- * ftrace_format_##call(struct trace_seq *s)
- * {
- * struct ftrace_raw_##call field;
- * int ret;
- *
- * ret = trace_seq_printf(s, #type " " #item ";"
- * " offset:%u; size:%u;\n",
- * offsetof(struct ftrace_raw_##call, item),
- * sizeof(field.type));
- *
- * }
- */
-
-#undef TP_STRUCT__entry
-#define TP_STRUCT__entry(args...) args
-
-#undef __field
-#define __field(type, item) \
- ret = trace_seq_printf(s, "\tfield:" #type " " #item ";\t" \
- "offset:%u;\tsize:%u;\n", \
- (unsigned int)offsetof(typeof(field), item), \
- (unsigned int)sizeof(field.item)); \
- if (!ret) \
- return 0;
-
-#undef __array
-#define __array(type, item, len) \
- ret = trace_seq_printf(s, "\tfield:" #type " " #item "[" #len "];\t" \
- "offset:%u;\tsize:%u;\n", \
- (unsigned int)offsetof(typeof(field), item), \
- (unsigned int)sizeof(field.item)); \
- if (!ret) \
- return 0;
-
-#undef __dynamic_array
-#define __dynamic_array(type, item, len) \
- ret = trace_seq_printf(s, "\tfield:__data_loc " #item ";\t" \
- "offset:%u;\tsize:%u;\n", \
- (unsigned int)offsetof(typeof(field), \
- __data_loc_##item), \
- (unsigned int)sizeof(field.__data_loc_##item)); \
- if (!ret) \
- return 0;
-
-#undef __string
-#define __string(item, src) __dynamic_array(char, item, -1)
-
-#undef __entry
-#define __entry REC
-
-#undef TP_printk
-#define TP_printk(fmt, args...) "%s, %s\n", #fmt, __stringify(args)
-
-#undef TP_fast_assign
-#define TP_fast_assign(args...) args
-
-#undef TRACE_EVENT
-#define TRACE_EVENT(call, proto, args, tstruct, func, print) \
-static int \
-ftrace_format_##call(struct trace_seq *s) \
-{ \
- struct ftrace_raw_##call field __attribute__((unused)); \
- int ret = 0; \
- \
- tstruct; \
- \
- trace_seq_printf(s, "\nprint fmt: " print); \
- \
- return ret; \
-}
-
-#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
-
#undef __field
#define __field(type, item) \
ret = trace_define_field(event_call, #type, #item, \

2009-06-10 20:48:55

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH][GIT PULL] tracing: do not translate event helper macros in print format


* Steven Rostedt <[email protected]> wrote:

> Ingo,
>
> This looks like it changed more than it really did, I only moved
> the one macro above another and added a few lines:
>
> +#undef __print_symbolic
> +#undef __get_dynamic_array
> +#undef __get_str
> +
>
> Please pull the latest tip/tracing/ftrace-1 tree, which can be found at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
> tip/tracing/ftrace-1
>
>
> Steven Rostedt (1):
> tracing: do not translate event helper macros in print format
>
> ----
> include/trace/ftrace.h | 158 ++++++++++++++++++++++++-----------------------
> 1 files changed, 81 insertions(+), 77 deletions(-)
> ---------------------------
> commit 6ff9a64d2aaa6eae396adc95e9c91c0cbfa6dbe4
> Author: Steven Rostedt <[email protected]>
> Date: Wed Jun 10 14:28:34 2009 -0400
>
> tracing: do not translate event helper macros in print format
>
> By moving the macro that creates the print format code above the
> defining of the event macro helpers (__get_str, __print_symbolic,
> and __get_dynamic_array), we get a little cleaner print format.
>
> Instead of:
>
> (char *)((void *)REC + REC->__data_loc_name)
>
> we get:
>
> __get_str(name)
>
> Instead of:
>
> ({ static const struct trace_print_flags symbols[] = { { HI_SOFTIRQ, "HI" }, {
>
> we get:
>
> __print_symbolic(REC->vec, { HI_SOFTIRQ, "HI" }, {

Yeah, that's really nice! Pulled, thanks Steve!

Ingo

2009-06-11 12:52:44

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH][GIT PULL] tracing: do not translate event helper macros in print format

On Wed, Jun 10, 2009 at 10:48:02PM +0200, Ingo Molnar wrote:
> > This looks like it changed more than it really did, I only moved
> > the one macro above another and added a few lines:
> >
> > +#undef __print_symbolic
> > +#undef __get_dynamic_array
> > +#undef __get_str

> > tracing: do not translate event helper macros in print format
> >
> > By moving the macro that creates the print format code above the
> > defining of the event macro helpers (__get_str, __print_symbolic,
> > and __get_dynamic_array), we get a little cleaner print format.

__print_flags should also be added, shouldn't it?

Maybe also le/be*_to_cpu/cpu_to_le/be*? and major/minor?

We should also document those helpers that need special treatment
somewhere?

2009-06-11 13:04:21

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH][GIT PULL] tracing: do not translate event helper macros in print format


On Thu, 11 Jun 2009, Christoph Hellwig wrote:

> On Wed, Jun 10, 2009 at 10:48:02PM +0200, Ingo Molnar wrote:
> > > This looks like it changed more than it really did, I only moved
> > > the one macro above another and added a few lines:
> > >
> > > +#undef __print_symbolic
> > > +#undef __get_dynamic_array
> > > +#undef __get_str
>
> > > tracing: do not translate event helper macros in print format
> > >
> > > By moving the macro that creates the print format code above the
> > > defining of the event macro helpers (__get_str, __print_symbolic,
> > > and __get_dynamic_array), we get a little cleaner print format.
>
> __print_flags should also be added, shouldn't it?

Darn, I knew I forgot one :-/

>
> Maybe also le/be*_to_cpu/cpu_to_le/be*? and major/minor?

We can only include macros that we define later. Otherwise we undefine
them for when we need them.

-- Steve

>
> We should also document those helpers that need special treatment
> somewhere?
>
>