2009-04-13 21:59:37

by Theodore Ts'o

[permalink] [raw]
Subject: blktrace: event traces displayed wrong while ftrace blktrace is active

Hi Li,

Thanks for working on the ftrace blktrace code; I was wondering if you
could take a look at something. While the nop or syscall tracer is
selected ("echo nop > /sys/kernel/debug/tracing/current_tracer"), the
event traces display like this:

<...>-4491 [001] 9556.305696: jbd2_start_commit: dev dm-0:8 transaction 335442
<...>-4491 [001] 9556.305700: jbd2_commit_locking: dev dm-0:8 transaction 335442
<...>-4491 [001] 9556.305713: jbd2_commit_flushing: dev dm-0:8 transaction 335442
<...>-4491 [001] 9556.305720: jbd2_submit_inode_data: dev dm-0 transaction ino 3807454
<...>-4491 [001] 9556.305833: jbd2_commit_logging: dev dm-0:8 transaction 335442
<...>-4491 [000] 9556.307241: jbd2_end_commit: dev dm-0:8 transaction 335442 head 334303

That is, the timestamp information what CPU, what pid, etc., is present.

However, if the blk tracer is active, this information is missing:

jbd2_start_commit: dev dm-0:8 transaction 333227
jbd2_commit_locking: dev dm-0:8 transaction 333227
jbd2_commit_flushing: dev dm-0:8 transaction 333227
jbd2_submit_inode_data: dev dm-0 transaction ino 1466869
jbd2_submit_inode_data: dev dm-0 transaction ino 1310793
jbd2_commit_logging: dev dm-0:8 transaction 333227
jbd2_end_commit: dev dm-0:8 transaction 333227 head 332814

The context information for the blktrace is present:

rm-25433 [001] 9578.307485: 8,18 m N cfq25433 slice expired t=0
rm-25433 [001] 9578.307486: 8,18 m N cfq25433 put_queue
kjournald2-14059 [001] 9579.032457: 8,18 A W 5631242 + 8 <- (8,18) 3671312
kjournald2-14059 [001] 9579.032462: 8,18 Q W 5631242 + 8 [kjournald2]
kjournald2-14059 [001] 9579.032470: 8,18 G W 5631242 + 8 [kjournald2]

It's just the context information for the event traces which are
missing.

Anyway, if you could take a look at this, it would be much appreciated.
Usually I can interpolate the the timestamps from the surrounding
blktrace lines, but it would be nice if I didn't have to do that.

Regards,

- Ted


2009-04-13 22:20:30

by Ingo Molnar

[permalink] [raw]
Subject: Re: blktrace: event traces displayed wrong while ftrace blktrace is active


* Theodore Ts'o <[email protected]> wrote:

> Hi Li,
>
> Thanks for working on the ftrace blktrace code; I was wondering if you
> could take a look at something. While the nop or syscall tracer is
> selected ("echo nop > /sys/kernel/debug/tracing/current_tracer"), the
> event traces display like this:
>
> <...>-4491 [001] 9556.305696: jbd2_start_commit: dev dm-0:8 transaction 335442
> <...>-4491 [001] 9556.305700: jbd2_commit_locking: dev dm-0:8 transaction 335442
> <...>-4491 [001] 9556.305713: jbd2_commit_flushing: dev dm-0:8 transaction 335442
> <...>-4491 [001] 9556.305720: jbd2_submit_inode_data: dev dm-0 transaction ino 3807454
> <...>-4491 [001] 9556.305833: jbd2_commit_logging: dev dm-0:8 transaction 335442
> <...>-4491 [000] 9556.307241: jbd2_end_commit: dev dm-0:8 transaction 335442 head 334303
>
> That is, the timestamp information what CPU, what pid, etc., is present.
>
> However, if the blk tracer is active, this information is missing:
>
> jbd2_start_commit: dev dm-0:8 transaction 333227
> jbd2_commit_locking: dev dm-0:8 transaction 333227
> jbd2_commit_flushing: dev dm-0:8 transaction 333227
> jbd2_submit_inode_data: dev dm-0 transaction ino 1466869
> jbd2_submit_inode_data: dev dm-0 transaction ino 1310793
> jbd2_commit_logging: dev dm-0:8 transaction 333227
> jbd2_end_commit: dev dm-0:8 transaction 333227 head 332814
>
> The context information for the blktrace is present:
>
> rm-25433 [001] 9578.307485: 8,18 m N cfq25433 slice expired t=0
> rm-25433 [001] 9578.307486: 8,18 m N cfq25433 put_queue
> kjournald2-14059 [001] 9579.032457: 8,18 A W 5631242 + 8 <- (8,18) 3671312
> kjournald2-14059 [001] 9579.032462: 8,18 Q W 5631242 + 8 [kjournald2]
> kjournald2-14059 [001] 9579.032470: 8,18 G W 5631242 + 8 [kjournald2]
>
> It's just the context information for the event traces which are
> missing.
>
> Anyway, if you could take a look at this, it would be much appreciated.
> Usually I can interpolate the the timestamps from the surrounding
> blktrace lines, but it would be nice if I didn't have to do that.

Yes - this happens because event tracer events are integrated based
on the TP_printk() format in the TRACE_EVENT() definition. So the
above:

jbd2_commit_flushing: dev dm-0:8 transaction 333227

Comes from your TP_printk() definition for that event. That format
is funneled through a number of macro stages to end up being called
by kernel/trace/trace.c:print_trace_fmt().

This is printed 'as-is' to allow maximum flexibility in shaping the
TP_format() output. I suspect it might make sense to standardize
this some more and bring it in line with the native plugin printout
style.

OTOH, what makes more sense instead is to concetrate on the event
tracer itself, and making sure it does all the things you need.
Having to do 'echo blk > current_tracer' is really an interim stage
- we want the generic tracer to be able to do all these things as
well - i.e. we want to integrate blktrace some more.

Anyway, added some more Cc:s.

Ingo

2009-04-14 10:05:20

by Li Zefan

[permalink] [raw]
Subject: Re: blktrace: event traces displayed wrong while ftrace blktrace is active

Theodore Ts'o wrote:
> Hi Li,
>
> Thanks for working on the ftrace blktrace code; I was wondering if you
> could take a look at something. While the nop or syscall tracer is
> selected ("echo nop > /sys/kernel/debug/tracing/current_tracer"), the
> event traces display like this:
>
> <...>-4491 [001] 9556.305696: jbd2_start_commit: dev dm-0:8 transaction 335442
> <...>-4491 [001] 9556.305700: jbd2_commit_locking: dev dm-0:8 transaction 335442
> <...>-4491 [001] 9556.305713: jbd2_commit_flushing: dev dm-0:8 transaction 335442
> <...>-4491 [001] 9556.305720: jbd2_submit_inode_data: dev dm-0 transaction ino 3807454
> <...>-4491 [001] 9556.305833: jbd2_commit_logging: dev dm-0:8 transaction 335442
> <...>-4491 [000] 9556.307241: jbd2_end_commit: dev dm-0:8 transaction 335442 head 334303
>
> That is, the timestamp information what CPU, what pid, etc., is present.
>
> However, if the blk tracer is active, this information is missing:
>
> jbd2_start_commit: dev dm-0:8 transaction 333227
> jbd2_commit_locking: dev dm-0:8 transaction 333227
> jbd2_commit_flushing: dev dm-0:8 transaction 333227
> jbd2_submit_inode_data: dev dm-0 transaction ino 1466869
> jbd2_submit_inode_data: dev dm-0 transaction ino 1310793
> jbd2_commit_logging: dev dm-0:8 transaction 333227
> jbd2_end_commit: dev dm-0:8 transaction 333227 head 332814
>

The cause is:

static void blk_tracer_start(struct trace_array *tr)
{
blk_tracer_enabled = true;
trace_flags &= ~TRACE_ITER_CONTEXT_INFO; <--- !!
}

If TRACE_ITER_CONTEXT_INFO is set, 'comm-pid [cpu] ts' will be
printed, and vise versa.

The purpose to unset this flag is to make binary output can be
parsed by blkparse.

> The context information for the blktrace is present:
>
> rm-25433 [001] 9578.307485: 8,18 m N cfq25433 slice expired t=0
> rm-25433 [001] 9578.307486: 8,18 m N cfq25433 put_queue
> kjournald2-14059 [001] 9579.032457: 8,18 A W 5631242 + 8 <- (8,18) 3671312
> kjournald2-14059 [001] 9579.032462: 8,18 Q W 5631242 + 8 [kjournald2]
> kjournald2-14059 [001] 9579.032470: 8,18 G W 5631242 + 8 [kjournald2]
>
> It's just the context information for the event traces which are
> missing.
>
> Anyway, if you could take a look at this, it would be much appreciated.
> Usually I can interpolate the the timestamps from the surrounding
> blktrace lines, but it would be nice if I didn't have to do that.
>

2009-04-14 21:25:48

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: blktrace: event traces displayed wrong while ftrace blktrace is active

On Tue, Apr 14, 2009 at 06:03:39PM +0800, Li Zefan wrote:
> Theodore Ts'o wrote:
> > Hi Li,
> >
> > Thanks for working on the ftrace blktrace code; I was wondering if you
> > could take a look at something. While the nop or syscall tracer is
> > selected ("echo nop > /sys/kernel/debug/tracing/current_tracer"), the
> > event traces display like this:
> >
> > <...>-4491 [001] 9556.305696: jbd2_start_commit: dev dm-0:8 transaction 335442
> > <...>-4491 [001] 9556.305700: jbd2_commit_locking: dev dm-0:8 transaction 335442
> > <...>-4491 [001] 9556.305713: jbd2_commit_flushing: dev dm-0:8 transaction 335442
> > <...>-4491 [001] 9556.305720: jbd2_submit_inode_data: dev dm-0 transaction ino 3807454
> > <...>-4491 [001] 9556.305833: jbd2_commit_logging: dev dm-0:8 transaction 335442
> > <...>-4491 [000] 9556.307241: jbd2_end_commit: dev dm-0:8 transaction 335442 head 334303
> >
> > That is, the timestamp information what CPU, what pid, etc., is present.
> >
> > However, if the blk tracer is active, this information is missing:
> >
> > jbd2_start_commit: dev dm-0:8 transaction 333227
> > jbd2_commit_locking: dev dm-0:8 transaction 333227
> > jbd2_commit_flushing: dev dm-0:8 transaction 333227
> > jbd2_submit_inode_data: dev dm-0 transaction ino 1466869
> > jbd2_submit_inode_data: dev dm-0 transaction ino 1310793
> > jbd2_commit_logging: dev dm-0:8 transaction 333227
> > jbd2_end_commit: dev dm-0:8 transaction 333227 head 332814
> >
>
> The cause is:
>
> static void blk_tracer_start(struct trace_array *tr)
> {
> blk_tracer_enabled = true;
> trace_flags &= ~TRACE_ITER_CONTEXT_INFO; <--- !!
> }
>
> If TRACE_ITER_CONTEXT_INFO is set, 'comm-pid [cpu] ts' will be
> printed, and vise versa.
>
> The purpose to unset this flag is to make binary output can be
> parsed by blkparse.


Indeed. Note that you can overwrite this default by typing:

echo context-info > trace_option

after setting bkltrace as the current tracer.

Frederic.



>
> > The context information for the blktrace is present:
> >
> > rm-25433 [001] 9578.307485: 8,18 m N cfq25433 slice expired t=0
> > rm-25433 [001] 9578.307486: 8,18 m N cfq25433 put_queue
> > kjournald2-14059 [001] 9579.032457: 8,18 A W 5631242 + 8 <- (8,18) 3671312
> > kjournald2-14059 [001] 9579.032462: 8,18 Q W 5631242 + 8 [kjournald2]
> > kjournald2-14059 [001] 9579.032470: 8,18 G W 5631242 + 8 [kjournald2]
> >
> > It's just the context information for the event traces which are
> > missing.
> >
> > Anyway, if you could take a look at this, it would be much appreciated.
> > Usually I can interpolate the the timestamps from the surrounding
> > blktrace lines, but it would be nice if I didn't have to do that.
> >
>

2009-04-14 21:32:26

by Theodore Ts'o

[permalink] [raw]
Subject: Re: blktrace: event traces displayed wrong while ftrace blktrace is active

On Tue, Apr 14, 2009 at 11:25:33PM +0200, Frederic Weisbecker wrote:
>
> Indeed. Note that you can overwrite this default by typing:
>
> echo context-info > trace_option
>
> after setting bkltrace as the current tracer.

The problem with doing that is that blktrace will then display this:

fsync-tester-27934 [001] 208.031278: fsync-tester-27934 [001] 208.031278: 254,4 Q WS 268392 + 8 [fsync-tester]

I think the real problem is that blk trace is displaying the standard
context-info, so it's suppressing the normal context info. So I have
the choice on having no context information on my event trace lines,
or two copies of the context information for the block trace lines.

:-(

I'm guessing things were done this way in order to support the old
legancy blktrace format?

- Ted

2009-04-15 00:16:45

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: blktrace: event traces displayed wrong while ftrace blktrace is active

Em Tue, Apr 14, 2009 at 05:32:07PM -0400, Theodore Tso escreveu:
> On Tue, Apr 14, 2009 at 11:25:33PM +0200, Frederic Weisbecker wrote:
> >
> > Indeed. Note that you can overwrite this default by typing:
> >
> > echo context-info > trace_option
> >
> > after setting bkltrace as the current tracer.
>
> The problem with doing that is that blktrace will then display this:
>
> fsync-tester-27934 [001] 208.031278: fsync-tester-27934 [001] 208.031278: 254,4 Q WS 268392 + 8 [fsync-tester]
>
> I think the real problem is that blk trace is displaying the standard
> context-info, so it's suppressing the normal context info. So I have
> the choice on having no context information on my event trace lines,
> or two copies of the context information for the block trace lines.
>
> :-(
>
> I'm guessing things were done this way in order to support the old
> legancy blktrace format?

Well, last time I worked in this code the default was to use the default
context information, not duplicate it, and if one used:

echo blk_classic > /sys/kernel/debug/tracing/trace_options

Then it would look like the output of blktrace(8).

Lemme try to boot the current code...

- Arnaldo

2009-04-15 00:57:14

by Li Zefan

[permalink] [raw]
Subject: Re: blktrace: event traces displayed wrong while ftrace blktrace is active

Theodore Tso wrote:
> On Tue, Apr 14, 2009 at 11:25:33PM +0200, Frederic Weisbecker wrote:
>> Indeed. Note that you can overwrite this default by typing:
>>
>> echo context-info > trace_option
>>
>> after setting bkltrace as the current tracer.
>
> The problem with doing that is that blktrace will then display this:
>
> fsync-tester-27934 [001] 208.031278: fsync-tester-27934 [001] 208.031278: 254,4 Q WS 268392 + 8 [fsync-tester]
>
> I think the real problem is that blk trace is displaying the standard
> context-info, so it's suppressing the normal context info. So I have
> the choice on having no context information on my event trace lines,
> or two copies of the context information for the block trace lines.
>
> :-(
>

I'll provide a patch to solve this confusing behavior.

> I'm guessing things were done this way in order to support the old
> legancy blktrace format?
>

Right, in 2 ways:

1. we can make the output the same as userspace 'btrace /dev/sda':
# echo blk > current_tracer
# echo 1 > options/blk_classic

2. we can make the binary output parsable by userspace blkparse:
# echo blk > current_tracer
# echo bin > trace_options
# cat trace_pipe | blkparse -i -