Jens, just a FYI about this stuff, not yet ready for reviewing.
Hi Fr?d?ric, Steven,
Have you guys ever get into this kind of situation?
[root@mica linux-2.6-tip]# cat /sys/kernel/debug/tracing/trace | tail
kjournald-480 [000] 593.219805: C W 2149459 + 8 [0]
<idle>-0 [000] 593.219849: A WS 2149467 + 8 <- (8,2) 44952
kjournald-480 [000] 607.210959: Q WS 2149467 + 8
kjournald-480 [000] 607.210960: G WS 2149467 + 8
kjournald-480 [000] 607.210963: P N
kjournald-480 [000] 607.210964: I W 2149467 + 8
kjournald-480 [000] 607.210965: D W 2149467 + 8
kjournald-480 [000] 607.210967: U N 1
kjournald-480 [000] 607.210971: C W 2149467 + 8 [0]
<idle>-0 [000] 607.211049: [root@mica linux-2.6-tip]#
[root@mica linux-2.6-tip]# head /sys/kernel/debug/tracing/trace
# tracer: blk
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
A W 2147707 + 8 <- (8,2) 43192
kjournald-480 [000] 254.428659: Q W 2147707 + 8
kjournald-480 [000] 254.428661: G W 2147707 + 8
kjournald-480 [000] 254.428663: I W 100410211 + 8
pdflush-261 [000] 147.328259: P N
kjournald-480 [000] 254.428664: I W 2147707 + 8
[root@mica linux-2.6-tip]#
This is happening in an experiment I'm doing in porting blktrace to the
tracing/ring_buffer infrastructure, the problem always happens at this
function:
static int blk_log_remap(struct trace_seq *s, struct blk_io_entry *t)
{
char rwbs[6];
struct blk_io_trace_remap r = { .device = 0, };
get_pdu_remap(t, &r);
t->device = r.device_from;
fill_rwbs(rwbs, t);
return trace_seq_printf(s, " A %3s %llu + %u <- (%d,%d) %llu\n",
rwbs, (unsigned long long)t->sector,
t_sec(t), MAJOR(r.device), MINOR(r.device),
(unsigned long long)r.sector);
}
And the registered .print_line routine does check the return of
trace_seq_printf:
if (!ret)
return TRACE_TYPE_PARTIAL_LINE;
}
default:
return TRACE_TYPE_UNHANDLED;
}
return TRACE_TYPE_HANDLED;
}
Full patch, still WIPish:
http://oops.ghostprotocols.net:81/acme/blkftrace.patch
- Arnaldo
2008/12/16 Arnaldo Carvalho de Melo <[email protected]>:
> Jens, just a FYI about this stuff, not yet ready for reviewing.
>
> Hi Fr?d?ric, Steven,
>
> Have you guys ever get into this kind of situation?
>
> [root@mica linux-2.6-tip]# cat /sys/kernel/debug/tracing/trace | tail
> kjournald-480 [000] 593.219805: C W 2149459 + 8 [0]
> <idle>-0 [000] 593.219849: A WS 2149467 + 8 <- (8,2) 44952
> kjournald-480 [000] 607.210959: Q WS 2149467 + 8
> kjournald-480 [000] 607.210960: G WS 2149467 + 8
> kjournald-480 [000] 607.210963: P N
> kjournald-480 [000] 607.210964: I W 2149467 + 8
> kjournald-480 [000] 607.210965: D W 2149467 + 8
> kjournald-480 [000] 607.210967: U N 1
> kjournald-480 [000] 607.210971: C W 2149467 + 8 [0]
> <idle>-0 [000] 607.211049: [root@mica linux-2.6-tip]#
>
> [root@mica linux-2.6-tip]# head /sys/kernel/debug/tracing/trace
> # tracer: blk
> #
> # TASK-PID CPU# TIMESTAMP FUNCTION
> # | | | | |
> A W 2147707 + 8 <- (8,2) 43192
> kjournald-480 [000] 254.428659: Q W 2147707 + 8
> kjournald-480 [000] 254.428661: G W 2147707 + 8
> kjournald-480 [000] 254.428663: I W 100410211 + 8
> pdflush-261 [000] 147.328259: P N
> kjournald-480 [000] 254.428664: I W 2147707 + 8
> [root@mica linux-2.6-tip]#
Actually, if you implement a print_line callback for your tracer,
you're not supposed to see the
task-pid, cpu number and time printed before your traces. You should
only see what you print with your tracer.
task-pid/cpu/timespamp are only printed by the default printing
functions in trace.c, ie: used by
the function tracer, the stack tracer, the sched tracer... or when a
tracer's print_line
returns TRACE_TYPE_UNHANDLED.
And that's what is happening with your tracer.
If you look at the following:
> +static enum print_line_t blk_tracer_print_line(struct trace_iterator *iter)
> +{
> + struct trace_seq *s = &iter->seq;
> + struct trace_entry *entry = iter->ent;
> +
> + switch (entry->type) {
> + case TRACE_BLOCK: {
> + int ret, act;
> + struct blk_io_entry *t;
> +
> + trace_assign_type(t, entry);
> + act = what_action(t);
> +
> + switch (act) {
> + case __BLK_TA_ISSUE:
> + ret = blk_log_generic(s, t, "D");
> + break;
> + case __BLK_TA_INSERT:
> + ret = blk_log_generic(s, t, "I");
> + break;
> + case __BLK_TA_QUEUE:
> + ret = blk_log_generic(s, t, "Q");
> + break;
> + case __BLK_TA_BOUNCE:
> + ret = blk_log_generic(s, t, "B");
> + break;
> + case __BLK_TA_BACKMERGE:
> + ret = blk_log_generic(s, t, "M");
> + break;
> + case __BLK_TA_FRONTMERGE:
> + ret = blk_log_generic(s, t, "F");
> + break;
> + case __BLK_TA_GETRQ:
> + ret = blk_log_generic(s, t, "G");
> + break;
> + case __BLK_TA_SLEEPRQ:
> + ret = blk_log_generic(s, t, "S");
> + break;
> + case __BLK_TA_REQUEUE:
> + ret = blk_log_with_error(s, t, "R");
> + break;
> + case __BLK_TA_COMPLETE:
> + ret = blk_log_with_error(s, t, "C");
> + break;
> + case __BLK_TA_PLUG:
> + ret = blk_log_action(s, t, "P");
> + break;
> + case __BLK_TA_UNPLUG_IO:
> + ret = blk_log_unplug(s, t, "U");
> + break;
> + case __BLK_TA_UNPLUG_TIMER:
> + ret = blk_log_unplug(s, t, "UT");
> + break;
> + case __BLK_TA_REMAP:
> + ret = blk_log_remap(s, t);
> + break;
> + case __BLK_TA_SPLIT:
> + ret = blk_log_split(s, t, "X");
> + break;
> + default:
> + ret = trace_seq_printf(s, "Bad pc action %x\n", act);
> + break;
> + }
> + if (!ret)
> + return TRACE_TYPE_PARTIAL_LINE;
> + }
> + default:
> + return TRACE_TYPE_UNHANDLED;
> + }
> +
> + return TRACE_TYPE_HANDLED;
> +}
You are always returning TRACE_TYPE_UNHANDLED because you missed a
break statement at the
end of your TRACE_BLOCK case :-)
So you are printing your trace, and then the default printing
functions on trace.c, assuming you haven't
handled it, will print task/cpu/timestamp for a function trace or a
stack trace or whatever...
Hence the weird output :-)