I found the timestamp is wrong:
# echo bin > trace_option
# echo blk > current_tracer
# cat trace_pipe | blkparse -i -
8,0 0 0 0.000000000 504 A W ...
...
8,7 1 0 0.008534097 0 C R ...
(should be 8.534097xxx)
user-space blkparse expects the timestamp to be nanosecond.
Signed-off-by: Li Zefan <[email protected]>
---
kernel/trace/blktrace.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 6fb274f..ee7a8bb 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1168,7 +1168,7 @@ static int blk_trace_synthesize_old_trace(struct trace_iterator *iter)
const int offset = offsetof(struct blk_io_trace, sector);
struct blk_io_trace old = {
.magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION,
- .time = ns2usecs(iter->ts),
+ .time = iter->ts,
};
if (!trace_seq_putmem(s, &old, offset))
--
1.5.4.rc3
t1 t2
------ ------
do_blk_trace_setup() do_blk_trace_setup()
if (!blk_tree_root) {
if (!blk_tree_root)
blk_tree_root = create_dir()
blk_tree_root = create_dir();
(now blk_tree_root == NULL)
...
dir = create_dir(name, blk_tree_root);
Due to this race, t1 will create 'dir' in /debugfs but not /debugfs/block.
Signed-off-by: Li Zefan <[email protected]>
---
kernel/trace/blktrace.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index ee7a8bb..95f89fa 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -426,11 +426,15 @@ int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
ret = -ENOENT;
+ mutex_lock(&blk_tree_mutex);
if (!blk_tree_root) {
blk_tree_root = debugfs_create_dir("block", NULL);
- if (!blk_tree_root)
+ if (!blk_tree_root) {
+ mutex_unlock(&blk_tree_mutex);
goto err;
+ }
}
+ mutex_unlock(&blk_tree_mutex);
dir = debugfs_create_dir(buts->name, blk_tree_root);
--
1.5.4.rc3
Currently the original blktrace, which is using relay and is used via
ioctl, is broken. You can use ftrace to see the output of blktrace,
but user-space blktrace is unusable.
It's broken by "blktrace: add ftrace plugin"
(c71a896154119f4ca9e89d6078f5f63ad60ef199)
========
@@ -131,13 +162,14 @@ static void __blk_add_trace()
- if (unlikely(bt->trace_state != Blktrace_running))
+ if (unlikely(bt->trace_state != Blktrace_running || !blk_tracer_enabled))
return;
=========
With this patch, both ioctl and ftrace can be used, but of course you
can't use both of them at the same time.
Signed-off-by: Li Zefan <[email protected]>
---
kernel/trace/blktrace.c | 12 +++++-------
1 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 95f89fa..a7f7ff5 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -110,7 +110,7 @@ void __trace_note_message(struct blk_trace *bt, const char *fmt, ...)
unsigned long flags;
char *buf;
- if (blk_tr) {
+ if (blk_tracer_enabled) {
va_start(args, fmt);
ftrace_vprintk(fmt, args);
va_end(args);
@@ -169,7 +169,7 @@ static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
pid_t pid;
int cpu, pc = 0;
- if (unlikely(bt->trace_state != Blktrace_running ||
+ if (unlikely(bt->trace_state != Blktrace_running &&
!blk_tracer_enabled))
return;
@@ -185,7 +185,7 @@ static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
return;
cpu = raw_smp_processor_id();
- if (blk_tr) {
+ if (blk_tracer_enabled) {
tracing_record_cmdline(current);
pc = preempt_count();
@@ -235,7 +235,7 @@ record_it:
if (pdu_len)
memcpy((void *) t + sizeof(*t), pdu_data, pdu_len);
- if (blk_tr) {
+ if (blk_tracer_enabled) {
trace_buffer_unlock_commit(blk_tr, event, 0, pc);
return;
}
@@ -267,8 +267,7 @@ int blk_trace_remove(struct request_queue *q)
if (!bt)
return -EINVAL;
- if (bt->trace_state == Blktrace_setup ||
- bt->trace_state == Blktrace_stopped)
+ if (bt->trace_state != Blktrace_running)
blk_trace_cleanup(bt);
return 0;
@@ -1273,7 +1272,6 @@ static int blk_trace_setup_queue(struct request_queue *q, dev_t dev)
bt->dev = dev;
bt->act_mask = (u16)-1;
bt->end_lba = -1ULL;
- bt->trace_state = Blktrace_running;
old_bt = xchg(&q->blk_trace, bt);
if (old_bt != NULL) {
--
1.5.4.rc3
* Li Zefan <[email protected]> wrote:
> Currently the original blktrace, which is using relay and is used
> via ioctl, is broken. You can use ftrace to see the output of
> blktrace, but user-space blktrace is unusable.
good catch, thanks Li!
Jens, Arnaldo, do these fixes look good to you too?
Ingo
On Wed, Mar 25 2009, Ingo Molnar wrote:
>
> * Li Zefan <[email protected]> wrote:
>
> > Currently the original blktrace, which is using relay and is used
> > via ioctl, is broken. You can use ftrace to see the output of
> > blktrace, but user-space blktrace is unusable.
>
> good catch, thanks Li!
>
> Jens, Arnaldo, do these fixes look good to you too?
Look fine, but I'm very worried about the state of the current code. I
mean, this last round of fixes got the timestamp correct and made
blktrace work again. Those are something that should have been caught
even before the whole thing was posted, let alone merged. When do you
plan on pushing this upstream? Looks like 2.6.31 to me, it's clearly not
ready for 2.6.30 by any stretch.
--
Jens Axboe
* Jens Axboe <[email protected]> wrote:
> On Wed, Mar 25 2009, Ingo Molnar wrote:
> >
> > * Li Zefan <[email protected]> wrote:
> >
> > > Currently the original blktrace, which is using relay and is used
> > > via ioctl, is broken. You can use ftrace to see the output of
> > > blktrace, but user-space blktrace is unusable.
> >
> > good catch, thanks Li!
> >
> > Jens, Arnaldo, do these fixes look good to you too?
>
> Look fine, but I'm very worried about the state of the current
> code. I mean, this last round of fixes got the timestamp correct
> and made blktrace work again. [...]
Correct. I tested it on a 16way box (see the blkparse output below).
If you can see any sort of anomaly please let us know so we can fix
it.
> [...] Those are something that should have been caught even before
> the whole thing was posted, let alone merged. When do you plan on
> pushing this upstream? Looks like 2.6.31 to me, it's clearly not
> ready for 2.6.30 by any stretch.
Well, apparently nobody tried ioctl based blktrace+blkparse on -tip
or linux-next in the past ~1 month. The relayfs portion was (meant
to be) kept largely untouched but this bug still crept in.
Li started looking into it and found the bug. I tested the ftrace
plugin side regularly, but you are right that this bug took too long
to find (over a month) - from now on i'll check the ioctl side more
regularly too, for all tracing and relayfs changes as well.
Ingo
---------------->
8,0 13 1 0.000000000 6219 A R 20475199 + 128 <- (8,1) 20475136
8,0 13 2 0.000000721 6219 Q R 20475199 + 128 [cc1]
8,0 13 3 0.000008680 6219 G R 20475199 + 128 [cc1]
8,0 13 4 0.000010792 6219 P N [cc1]
8,0 13 5 0.000012401 6219 I R 20475199 + 128 [cc1]
8,0 13 6 0.000016097 6219 U N [cc1] 1
8,0 13 7 0.000023499 6219 D R 20475199 + 128 [cc1]
8,0 10 1 0.018127785 0 C R 20475199 + 128 [0]
8,0 10 2 0.347266011 1059 A W 20423 + 8 <- (8,1) 20360
8,0 10 3 0.347266724 1059 Q W 20423 + 8 [kjournald]
8,0 10 4 0.347271298 1059 G W 20423 + 8 [kjournald]
8,0 10 5 0.347273197 1059 P N [kjournald]
8,0 10 6 0.347274678 1059 I W 20423 + 8 [kjournald]
8,0 10 7 0.347315501 1059 A W 20431 + 8 <- (8,1) 20368
8,0 10 8 0.347315665 1059 Q W 20431 + 8 [kjournald]
[...]
8,0 10 17481 16.338721149 1059 M W 22047 + 8 [kjournald]
8,0 10 17482 16.338721545 1059 A W 22055 + 8 <- (8,1) 21992
8,0 10 17483 16.338721701 1059 Q W 22055 + 8 [kjournald]
8,0 10 17484 16.338721887 1059 M W 22055 + 8 [kjournald]
8,0 10 17485 16.338722389 1059 A W 22063 + 8 <- (8,1) 22000
8,0 10 17486 16.338722551 1059 Q W 22063 + 8 [kjournald]
8,0 10 17487 16.338722810 1059 M W 22063 + 8 [kjournald]
8,0 10 17488 16.338723586 1059 A W 22071 + 8 <- (8,1) 22008
8,0 10 17489 16.338723764 1059 Q W 22071 + 8 [kjournald]
8,0 10 17490 16.338723989 1059 M W 22071 + 8 [kjournald]
8,0 10 17491 16.338724547 1059 A W 22079 + 8 <- (8,1) 22016
8,0 10 17492 16.338724724 1059 Q W 22079 + 8 [kjournald]
8,0 10 17493 16.338724989 1059 M W 22079 + 8 [kjournald]
8,0 10 17494 16.338725489 1059 A W 22087 + 8 <- (8,1) 22024
8,0 10 17495 16.338725656 1059 Q W 22087 + 8 [kjournald]
8,0 10 17496 16.338725863 1059 M W 22087 + 8 [kjournald]
8,0 10 17497 16.338726383 1059 A W 22095 + 8 <- (8,1) 22032
8,0 10 17498 16.338726540 1059 Q W 22095 + 8 [kjournald]
8,0 10 17499 16.338726819 1059 M W 22095 + 8 [kjournald]
8,0 10 17500 16.338727300 1059 A W 22103 + 8 <- (8,1) 22040
8,0 10 17501 16.338727448 1059 Q W 22103 + 8 [kjournald]
8,0 10 17502 16.338727651 1059 M W 22103 + 8 [kjournald]
8,0 10 17503 16.338728145 1059 A W 22111 + 8 <- (8,1) 22048
8,0 10 17504 16.338728304 1059 Q W 22111 + 8 [kjournald]
8,0 10 17505 16.338728568 1059 M W 22111 + 8 [kjournald]
8,0 10 17506 16.338729100 1059 A W 22119 + 8 <- (8,1) 22056
8,0 10 17507 16.338729251 1059 Q W 22119 + 8 [kjournald]
8,0 10 17508 16.338729469 1059 M W 22119 + 8 [kjournald]
8,0 10 17509 16.338729980 1059 A W 22127 + 8 <- (8,1) 22064
8,0 10 17510 16.338730133 1059 Q W 22127 + 8 [kjournald]
8,0 10 17511 16.338730381 1059 M W 22127 + 8 [kjournald]
8,0 10 17512 16.338731606 1059 U N [kjournald] 1
8,0 10 17513 16.338740396 1059 D W 21679 + 456 [kjournald]
8,0 10 17514 16.341011279 7516 C W 21679 + 456 [0]
8,0 10 17515 16.341075110 1059 A W 22135 + 8 <- (8,1) 22072
8,0 10 17516 16.341075415 1059 Q W 22135 + 8 [kjournald]
8,0 10 17517 16.341077704 1059 G W 22135 + 8 [kjournald]
8,0 10 17518 16.341079074 1059 P N [kjournald]
8,0 10 17519 16.341079897 1059 I W 22135 + 8 [kjournald]
8,0 10 17520 16.341082865 1059 U N [kjournald] 1
8,0 10 17521 16.341088012 1059 D W 22135 + 8 [kjournald]
8,0 10 17522 16.341507489 7516 C W 22135 + 8 [0]
8,0 0 1 16.849527724 6225 U N [distccd] 0
CPU0 (sda):
Reads Queued: 0, 0KiB Writes Queued: 0, 0KiB
Read Dispatches: 0, 0KiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
Read depth: 1 Write depth: 32
IO unplugs: 1 Timer unplugs: 0
CPU1 (sda):
Reads Queued: 0, 0KiB Writes Queued: 0, 0KiB
Read Dispatches: 0, 0KiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
Read depth: 1 Write depth: 32
IO unplugs: 2 Timer unplugs: 0
CPU5 (sda):
Reads Queued: 0, 0KiB Writes Queued: 0, 0KiB
Read Dispatches: 0, 0KiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
Read depth: 1 Write depth: 32
IO unplugs: 1 Timer unplugs: 0
CPU6 (sda):
Reads Queued: 0, 0KiB Writes Queued: 0, 0KiB
Read Dispatches: 0, 0KiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
Read depth: 1 Write depth: 32
IO unplugs: 3 Timer unplugs: 0
CPU8 (sda):
Reads Queued: 0, 0KiB Writes Queued: 0, 0KiB
Read Dispatches: 0, 0KiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
Read depth: 1 Write depth: 32
IO unplugs: 8 Timer unplugs: 0
CPU9 (sda):
Reads Queued: 1, 4KiB Writes Queued: 0, 0KiB
Read Dispatches: 1, 4KiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
Read depth: 1 Write depth: 32
IO unplugs: 20 Timer unplugs: 0
CPU10 (sda):
Reads Queued: 1, 24KiB Writes Queued: 5,647, 22,588KiB
Read Dispatches: 1, 24KiB Write Dispatches: 160, 22,588KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 4, 108KiB Writes Completed: 184, 22,684KiB
Read Merges: 0, 0KiB Write Merges: 5,487, 21,948KiB
Read depth: 1 Write depth: 32
IO unplugs: 35 Timer unplugs: 0
CPU11 (sda):
Reads Queued: 0, 0KiB Writes Queued: 24, 96KiB
Read Dispatches: 0, 0KiB Write Dispatches: 24, 96KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
Read depth: 1 Write depth: 32
IO unplugs: 1 Timer unplugs: 0
CPU12 (sda):
Reads Queued: 0, 0KiB Writes Queued: 0, 0KiB
Read Dispatches: 0, 0KiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
Read depth: 1 Write depth: 32
IO unplugs: 3 Timer unplugs: 0
CPU13 (sda):
Reads Queued: 2, 80KiB Writes Queued: 0, 0KiB
Read Dispatches: 2, 80KiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
Read depth: 1 Write depth: 32
IO unplugs: 15 Timer unplugs: 0
CPU15 (sda):
Reads Queued: 0, 0KiB Writes Queued: 0, 0KiB
Read Dispatches: 0, 0KiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 0, 0KiB
Read depth: 1 Write depth: 32
IO unplugs: 3 Timer unplugs: 0
Total (sda):
Reads Queued: 4, 108KiB Writes Queued: 5,671, 22,684KiB
Read Dispatches: 4, 108KiB Write Dispatches: 184, 22,684KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 4, 108KiB Writes Completed: 184, 22,684KiB
Read Merges: 0, 0KiB Write Merges: 5,487, 21,948KiB
IO unplugs: 92 Timer unplugs: 0
Throughput (R/W): 6KiB/s / 1,346KiB/s
Events (sda): 17,723 entries
Skips: 0 forward (0 - 0.0%)
Input file sda.blktrace.0 added
Input file sda.blktrace.1 added
Input file sda.blktrace.5 added
Input file sda.blktrace.6 added
Input file sda.blktrace.8 added
Input file sda.blktrace.9 added
Input file sda.blktrace.10 added
Input file sda.blktrace.11 added
Input file sda.blktrace.12 added
Input file sda.blktrace.13 added
Input file sda.blktrace.15 added
Commit-ID: d8708d61b1d4e70f2dd2ff0896bded28a7ca0ea4
Gitweb: http://git.kernel.org/tip/d8708d61b1d4e70f2dd2ff0896bded28a7ca0ea4
Author: Li Zefan <[email protected]>
AuthorDate: Wed, 25 Mar 2009 17:18:56 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 25 Mar 2009 12:05:38 +0100
blktrace: fix timestamp in binary output
I found the timestamp is wrong:
# echo bin > trace_option
# echo blk > current_tracer
# cat trace_pipe | blkparse -i -
8,0 0 0 0.000000000 504 A W ...
...
8,7 1 0 0.008534097 0 C R ...
(should be 8.534097xxx)
user-space blkparse expects the timestamp to be nanosecond.
Signed-off-by: Li Zefan <[email protected]>
Acked-by: Jens Axboe <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/trace/blktrace.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 6fb274f..ee7a8bb 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1168,7 +1168,7 @@ static int blk_trace_synthesize_old_trace(struct trace_iterator *iter)
const int offset = offsetof(struct blk_io_trace, sector);
struct blk_io_trace old = {
.magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION,
- .time = ns2usecs(iter->ts),
+ .time = iter->ts,
};
if (!trace_seq_putmem(s, &old, offset))
Commit-ID: a216c4acfafb28d697fe07b4137ab780131e4d60
Gitweb: http://git.kernel.org/tip/a216c4acfafb28d697fe07b4137ab780131e4d60
Author: Li Zefan <[email protected]>
AuthorDate: Wed, 25 Mar 2009 17:19:33 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 25 Mar 2009 12:05:39 +0100
blktrace: fix a race when creating blk_tree_root in debugfs
t1 t2
------ ------
do_blk_trace_setup() do_blk_trace_setup()
if (!blk_tree_root) {
if (!blk_tree_root)
blk_tree_root = create_dir()
blk_tree_root = create_dir();
(now blk_tree_root == NULL)
...
dir = create_dir(name, blk_tree_root);
Due to this race, t1 will create 'dir' in /debugfs but not /debugfs/block.
Signed-off-by: Li Zefan <[email protected]>
Acked-by: Jens Axboe <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/trace/blktrace.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index ee7a8bb..95f89fa 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -426,11 +426,15 @@ int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
ret = -ENOENT;
+ mutex_lock(&blk_tree_mutex);
if (!blk_tree_root) {
blk_tree_root = debugfs_create_dir("block", NULL);
- if (!blk_tree_root)
+ if (!blk_tree_root) {
+ mutex_unlock(&blk_tree_mutex);
goto err;
+ }
}
+ mutex_unlock(&blk_tree_mutex);
dir = debugfs_create_dir(buts->name, blk_tree_root);
Commit-ID: 29d661309424188942aeea9edcd362c1bd7414ab
Gitweb: http://git.kernel.org/tip/29d661309424188942aeea9edcd362c1bd7414ab
Author: Li Zefan <[email protected]>
AuthorDate: Wed, 25 Mar 2009 17:21:26 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 25 Mar 2009 12:05:39 +0100
blktrace: fix the original blktrace
Currently the original blktrace, which is using relay and is used via
ioctl, is broken. You can use ftrace to see the output of blktrace,
but user-space blktrace is unusable.
It's broken by "blktrace: add ftrace plugin"
(c71a896154119f4ca9e89d6078f5f63ad60ef199)
- if (unlikely(bt->trace_state != Blktrace_running))
+ if (unlikely(bt->trace_state != Blktrace_running || !blk_tracer_enabled))
return;
With this patch, both ioctl and ftrace can be used, but of course you
can't use both of them at the same time.
Signed-off-by: Li Zefan <[email protected]>
Acked-by: Jens Axboe <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/trace/blktrace.c | 12 +++++-------
1 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 95f89fa..a7f7ff5 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -110,7 +110,7 @@ void __trace_note_message(struct blk_trace *bt, const char *fmt, ...)
unsigned long flags;
char *buf;
- if (blk_tr) {
+ if (blk_tracer_enabled) {
va_start(args, fmt);
ftrace_vprintk(fmt, args);
va_end(args);
@@ -169,7 +169,7 @@ static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
pid_t pid;
int cpu, pc = 0;
- if (unlikely(bt->trace_state != Blktrace_running ||
+ if (unlikely(bt->trace_state != Blktrace_running &&
!blk_tracer_enabled))
return;
@@ -185,7 +185,7 @@ static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
return;
cpu = raw_smp_processor_id();
- if (blk_tr) {
+ if (blk_tracer_enabled) {
tracing_record_cmdline(current);
pc = preempt_count();
@@ -235,7 +235,7 @@ record_it:
if (pdu_len)
memcpy((void *) t + sizeof(*t), pdu_data, pdu_len);
- if (blk_tr) {
+ if (blk_tracer_enabled) {
trace_buffer_unlock_commit(blk_tr, event, 0, pc);
return;
}
@@ -267,8 +267,7 @@ int blk_trace_remove(struct request_queue *q)
if (!bt)
return -EINVAL;
- if (bt->trace_state == Blktrace_setup ||
- bt->trace_state == Blktrace_stopped)
+ if (bt->trace_state != Blktrace_running)
blk_trace_cleanup(bt);
return 0;
@@ -1273,7 +1272,6 @@ static int blk_trace_setup_queue(struct request_queue *q, dev_t dev)
bt->dev = dev;
bt->act_mask = (u16)-1;
bt->end_lba = -1ULL;
- bt->trace_state = Blktrace_running;
old_bt = xchg(&q->blk_trace, bt);
if (old_bt != NULL) {
Em Wed, Mar 25, 2009 at 12:47:20PM +0100, Ingo Molnar escreveu:
>
> * Jens Axboe <[email protected]> wrote:
>
> > On Wed, Mar 25 2009, Ingo Molnar wrote:
> > >
> > > * Li Zefan <[email protected]> wrote:
> > >
> > > > Currently the original blktrace, which is using relay and is used
> > > > via ioctl, is broken. You can use ftrace to see the output of
> > > > blktrace, but user-space blktrace is unusable.
> > >
> > > good catch, thanks Li!
> > >
> > > Jens, Arnaldo, do these fixes look good to you too?
> >
> > Look fine, but I'm very worried about the state of the current
> > code. I mean, this last round of fixes got the timestamp correct
> > and made blktrace work again. [...]
>
> Correct. I tested it on a 16way box (see the blkparse output below).
> If you can see any sort of anomaly please let us know so we can fix
> it.
>
> > [...] Those are something that should have been caught even before
> > the whole thing was posted, let alone merged. When do you plan on
> > pushing this upstream? Looks like 2.6.31 to me, it's clearly not
> > ready for 2.6.30 by any stretch.
>
> Well, apparently nobody tried ioctl based blktrace+blkparse on -tip
> or linux-next in the past ~1 month. The relayfs portion was (meant
> to be) kept largely untouched but this bug still crept in.
>
> Li started looking into it and found the bug. I tested the ftrace
> plugin side regularly, but you are right that this bug took too long
> to find (over a month) - from now on i'll check the ioctl side more
> regularly too, for all tracing and relayfs changes as well.
I got sidetracked by other stuff and didn't managed to continue working
on it, but at the same time testing by somebody else is needed. There is
no bug-free code, let alone in the first iteration, even less for
something I wasn't familiar with and depended on people like Li to do a
careful review.
Li fixed bugs and continued the work, getting things I wanted working
and made the initial steps to get done, like being able to just fed
trace_pipe into blkparse and get the same result as btrace.
So I apologise for not having continued working on it more often to
catch myself the bugs I introduced, and give a big thank you to Li for
the thorough review, bugfixes and improvements.
- Arnaldo
On Wed, Mar 25 2009, Arnaldo Carvalho de Melo wrote:
> Em Wed, Mar 25, 2009 at 12:47:20PM +0100, Ingo Molnar escreveu:
> >
> > * Jens Axboe <[email protected]> wrote:
> >
> > > On Wed, Mar 25 2009, Ingo Molnar wrote:
> > > >
> > > > * Li Zefan <[email protected]> wrote:
> > > >
> > > > > Currently the original blktrace, which is using relay and is used
> > > > > via ioctl, is broken. You can use ftrace to see the output of
> > > > > blktrace, but user-space blktrace is unusable.
> > > >
> > > > good catch, thanks Li!
> > > >
> > > > Jens, Arnaldo, do these fixes look good to you too?
> > >
> > > Look fine, but I'm very worried about the state of the current
> > > code. I mean, this last round of fixes got the timestamp correct
> > > and made blktrace work again. [...]
> >
> > Correct. I tested it on a 16way box (see the blkparse output below).
> > If you can see any sort of anomaly please let us know so we can fix
> > it.
> >
> > > [...] Those are something that should have been caught even before
> > > the whole thing was posted, let alone merged. When do you plan on
> > > pushing this upstream? Looks like 2.6.31 to me, it's clearly not
> > > ready for 2.6.30 by any stretch.
> >
> > Well, apparently nobody tried ioctl based blktrace+blkparse on -tip
> > or linux-next in the past ~1 month. The relayfs portion was (meant
> > to be) kept largely untouched but this bug still crept in.
> >
> > Li started looking into it and found the bug. I tested the ftrace
> > plugin side regularly, but you are right that this bug took too long
> > to find (over a month) - from now on i'll check the ioctl side more
> > regularly too, for all tracing and relayfs changes as well.
>
> I got sidetracked by other stuff and didn't managed to continue working
> on it, but at the same time testing by somebody else is needed. There is
> no bug-free code, let alone in the first iteration, even less for
> something I wasn't familiar with and depended on people like Li to do a
> careful review.
>
> Li fixed bugs and continued the work, getting things I wanted working
> and made the initial steps to get done, like being able to just fed
> trace_pipe into blkparse and get the same result as btrace.
>
> So I apologise for not having continued working on it more often to
> catch myself the bugs I introduced, and give a big thank you to Li for
> the thorough review, bugfixes and improvements.
Dont apologise, there's nothing wrong with that. The only thing wrong
would be for the unfinished code to get merged, when it clearly wasn't
ready.
--
Jens Axboe
Em Wed, Mar 25, 2009 at 02:56:39PM +0100, Jens Axboe escreveu:
> On Wed, Mar 25 2009, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Mar 25, 2009 at 12:47:20PM +0100, Ingo Molnar escreveu:
> > >
> > > * Jens Axboe <[email protected]> wrote:
> > >
> > > > On Wed, Mar 25 2009, Ingo Molnar wrote:
> > > > >
> > > > > * Li Zefan <[email protected]> wrote:
> > > > >
> > > > > > Currently the original blktrace, which is using relay and is used
> > > > > > via ioctl, is broken. You can use ftrace to see the output of
> > > > > > blktrace, but user-space blktrace is unusable.
> > > > >
> > > > > good catch, thanks Li!
> > > > >
> > > > > Jens, Arnaldo, do these fixes look good to you too?
> > > >
> > > > Look fine, but I'm very worried about the state of the current
> > > > code. I mean, this last round of fixes got the timestamp correct
> > > > and made blktrace work again. [...]
> > >
> > > Correct. I tested it on a 16way box (see the blkparse output below).
> > > If you can see any sort of anomaly please let us know so we can fix
> > > it.
> > >
> > > > [...] Those are something that should have been caught even before
> > > > the whole thing was posted, let alone merged. When do you plan on
> > > > pushing this upstream? Looks like 2.6.31 to me, it's clearly not
> > > > ready for 2.6.30 by any stretch.
> > >
> > > Well, apparently nobody tried ioctl based blktrace+blkparse on -tip
> > > or linux-next in the past ~1 month. The relayfs portion was (meant
> > > to be) kept largely untouched but this bug still crept in.
> > >
> > > Li started looking into it and found the bug. I tested the ftrace
> > > plugin side regularly, but you are right that this bug took too long
> > > to find (over a month) - from now on i'll check the ioctl side more
> > > regularly too, for all tracing and relayfs changes as well.
> >
> > I got sidetracked by other stuff and didn't managed to continue working
> > on it, but at the same time testing by somebody else is needed. There is
> > no bug-free code, let alone in the first iteration, even less for
> > something I wasn't familiar with and depended on people like Li to do a
> > careful review.
> >
> > Li fixed bugs and continued the work, getting things I wanted working
> > and made the initial steps to get done, like being able to just fed
> > trace_pipe into blkparse and get the same result as btrace.
> >
> > So I apologise for not having continued working on it more often to
> > catch myself the bugs I introduced, and give a big thank you to Li for
> > the thorough review, bugfixes and improvements.
>
> Dont apologise, there's nothing wrong with that. The only thing wrong
> would be for the unfinished code to get merged, when it clearly wasn't
> ready.
So in this case I guess the right approach is to keep this on a -tip
branch and delay it getting merged to .31?
- Arnaldo
Jens Axboe wrote:
> On Wed, Mar 25 2009, Ingo Molnar wrote:
>> * Li Zefan <[email protected]> wrote:
>>
>>> Currently the original blktrace, which is using relay and is used
>>> via ioctl, is broken. You can use ftrace to see the output of
>>> blktrace, but user-space blktrace is unusable.
>> good catch, thanks Li!
>>
>> Jens, Arnaldo, do these fixes look good to you too?
>
> Look fine, but I'm very worried about the state of the current code. I
> mean, this last round of fixes got the timestamp correct and made
> blktrace work again. Those are something that should have been caught
> even before the whole thing was posted, let alone merged. When do you
> plan on pushing this upstream? Looks like 2.6.31 to me, it's clearly not
> ready for 2.6.30 by any stretch.
>
Actually not all the bugs I found are introduced by "blktrace: ftrace plugin"
in -tip tree.
One is introduced by "block: get rid of the manual directory counting in blktrace"
(f48fc4d32e24c0b6a18aad30305d819bcc68c049). Two are "blktrace: port to tracepoints"
(5f3ea37c7716db4e894a480e0c18b24399595b6b). Both commits are in mainline.
Since 2 of the bugs will rarely happen in real-life, and the 3rd one is a small
issue, and we were so close to the release of .29, so I sent the fixes for -tip
tree but not mainline. But if we are going merge tip/blktrace to .31, I guess
it's better to merge that 3 fixes to .30?
On Thu, Mar 26 2009, Li Zefan wrote:
> Jens Axboe wrote:
> > On Wed, Mar 25 2009, Ingo Molnar wrote:
> >> * Li Zefan <[email protected]> wrote:
> >>
> >>> Currently the original blktrace, which is using relay and is used
> >>> via ioctl, is broken. You can use ftrace to see the output of
> >>> blktrace, but user-space blktrace is unusable.
> >> good catch, thanks Li!
> >>
> >> Jens, Arnaldo, do these fixes look good to you too?
> >
> > Look fine, but I'm very worried about the state of the current code. I
> > mean, this last round of fixes got the timestamp correct and made
> > blktrace work again. Those are something that should have been caught
> > even before the whole thing was posted, let alone merged. When do you
> > plan on pushing this upstream? Looks like 2.6.31 to me, it's clearly not
> > ready for 2.6.30 by any stretch.
> >
>
> Actually not all the bugs I found are introduced by "blktrace: ftrace plugin"
> in -tip tree.
Oh sure, I did see that! And the fixes are appreciated of course.
> One is introduced by "block: get rid of the manual directory counting in blktrace"
> (f48fc4d32e24c0b6a18aad30305d819bcc68c049). Two are "blktrace: port to tracepoints"
> (5f3ea37c7716db4e894a480e0c18b24399595b6b). Both commits are in mainline.
>
> Since 2 of the bugs will rarely happen in real-life, and the 3rd one
> is a small issue, and we were so close to the release of .29, so I
> sent the fixes for -tip tree but not mainline. But if we are going
> merge tip/blktrace to .31, I guess it's better to merge that 3 fixes
> to .30?
Since you are the person that worked on it most lately, your opinion
matters the most. What do you think, is it ready for 2.6.30 or should it
wait for .31?
--
Jens Axboe
* Jens Axboe <[email protected]> wrote:
> > One is introduced by "block: get rid of the manual directory counting in blktrace"
> > (f48fc4d32e24c0b6a18aad30305d819bcc68c049). Two are "blktrace: port to tracepoints"
> > (5f3ea37c7716db4e894a480e0c18b24399595b6b). Both commits are in mainline.
> >
> > Since 2 of the bugs will rarely happen in real-life, and the 3rd
> > one is a small issue, and we were so close to the release of
> > .29, so I sent the fixes for -tip tree but not mainline. But if
> > we are going merge tip/blktrace to .31, I guess it's better to
> > merge that 3 fixes to .30?
>
> Since you are the person that worked on it most lately, your
> opinion matters the most. What do you think, is it ready for
> 2.6.30 or should it wait for .31?
Yeah. Li, Arnaldo, what do you think?
Delaying them would be quite painful at this stage though - the
blktrace plugin conversion was done with (ahem) your initial support
so the commits got (foolishly, in hindsight ;-) interwoven into 300
commits of the 2.6.30 tracing tree.
Delaying them would also be technically baseless - there are no
known regressions or bugs in this code. (If you know about bugs then
please speak up so we can fix them! ;-)
At this last minute stage we can do two things: merge it now or if
you NAK it then we'll rebase the last ~2 months of the tracing tree
with hundreds of commits (sigh), destroy its true history in the
process and eradicate the blktrace bits.
I'd like to avoid the second option if possible as it destroys real
value (these changes are really nice improvements, a lot of work
went into them and there's no open regressions so i can see no
objective reason why they couldnt go upstream now) but it's your
choice really, you maintain block/* :-)
Ingo
Em Thu, Mar 26, 2009 at 02:37:32PM +0100, Ingo Molnar escreveu:
>
> * Jens Axboe <[email protected]> wrote:
>
> > > One is introduced by "block: get rid of the manual directory counting in blktrace"
> > > (f48fc4d32e24c0b6a18aad30305d819bcc68c049). Two are "blktrace: port to tracepoints"
> > > (5f3ea37c7716db4e894a480e0c18b24399595b6b). Both commits are in mainline.
> > >
> > > Since 2 of the bugs will rarely happen in real-life, and the 3rd
> > > one is a small issue, and we were so close to the release of
> > > .29, so I sent the fixes for -tip tree but not mainline. But if
> > > we are going merge tip/blktrace to .31, I guess it's better to
> > > merge that 3 fixes to .30?
> >
> > Since you are the person that worked on it most lately, your
> > opinion matters the most. What do you think, is it ready for
> > 2.6.30 or should it wait for .31?
>
> Yeah. Li, Arnaldo, what do you think?
>
> Delaying them would be quite painful at this stage though - the
> blktrace plugin conversion was done with (ahem) your initial support
> so the commits got (foolishly, in hindsight ;-) interwoven into 300
> commits of the 2.6.30 tracing tree.
>
> Delaying them would also be technically baseless - there are no
> known regressions or bugs in this code. (If you know about bugs then
> please speak up so we can fix them! ;-)
>
> At this last minute stage we can do two things: merge it now or if
> you NAK it then we'll rebase the last ~2 months of the tracing tree
> with hundreds of commits (sigh), destroy its true history in the
> process and eradicate the blktrace bits.
>
> I'd like to avoid the second option if possible as it destroys real
> value (these changes are really nice improvements, a lot of work
> went into them and there's no open regressions so i can see no
> objective reason why they couldnt go upstream now) but it's your
> choice really, you maintain block/* :-)
Well, after this set of fixes by Li the only problem I'm aware of is the
__trace_note_message, that is using ftrace_vprintk, that I didn't notice
because I wasn't using CFQ when developing it, and that gets the output
of the _ftrace_ plugin wedged, but that doesn't affect normal blktrace
operation.
I'll try to get that fixed somehow today, other than that I'm not aware
of any other problem, so I think it could get into 2.6.30 on the premise
that normal blktrace operation is as stable as before and that the
ftrace plugin is recent work and may still need some fixes.
- Arnaldo
On Thu, Mar 26 2009, Arnaldo Carvalho de Melo wrote:
> Em Thu, Mar 26, 2009 at 02:37:32PM +0100, Ingo Molnar escreveu:
> >
> > * Jens Axboe <[email protected]> wrote:
> >
> > > > One is introduced by "block: get rid of the manual directory counting in blktrace"
> > > > (f48fc4d32e24c0b6a18aad30305d819bcc68c049). Two are "blktrace: port to tracepoints"
> > > > (5f3ea37c7716db4e894a480e0c18b24399595b6b). Both commits are in mainline.
> > > >
> > > > Since 2 of the bugs will rarely happen in real-life, and the 3rd
> > > > one is a small issue, and we were so close to the release of
> > > > .29, so I sent the fixes for -tip tree but not mainline. But if
> > > > we are going merge tip/blktrace to .31, I guess it's better to
> > > > merge that 3 fixes to .30?
> > >
> > > Since you are the person that worked on it most lately, your
> > > opinion matters the most. What do you think, is it ready for
> > > 2.6.30 or should it wait for .31?
> >
> > Yeah. Li, Arnaldo, what do you think?
> >
> > Delaying them would be quite painful at this stage though - the
> > blktrace plugin conversion was done with (ahem) your initial support
> > so the commits got (foolishly, in hindsight ;-) interwoven into 300
> > commits of the 2.6.30 tracing tree.
> >
> > Delaying them would also be technically baseless - there are no
> > known regressions or bugs in this code. (If you know about bugs then
> > please speak up so we can fix them! ;-)
> >
> > At this last minute stage we can do two things: merge it now or if
> > you NAK it then we'll rebase the last ~2 months of the tracing tree
> > with hundreds of commits (sigh), destroy its true history in the
> > process and eradicate the blktrace bits.
> >
> > I'd like to avoid the second option if possible as it destroys real
> > value (these changes are really nice improvements, a lot of work
> > went into them and there's no open regressions so i can see no
> > objective reason why they couldnt go upstream now) but it's your
> > choice really, you maintain block/* :-)
>
> Well, after this set of fixes by Li the only problem I'm aware of is the
> __trace_note_message, that is using ftrace_vprintk, that I didn't notice
> because I wasn't using CFQ when developing it, and that gets the output
> of the _ftrace_ plugin wedged, but that doesn't affect normal blktrace
> operation.
>
> I'll try to get that fixed somehow today, other than that I'm not aware
> of any other problem, so I think it could get into 2.6.30 on the premise
> that normal blktrace operation is as stable as before and that the
> ftrace plugin is recent work and may still need some fixes.
Well, judging whether it is as stable as before is exactly what is asked
of you and Li :-). So which is it?
--
Jens Axboe
Em Thu, Mar 26, 2009 at 04:44:31PM +0100, Jens Axboe escreveu:
> On Thu, Mar 26 2009, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Mar 26, 2009 at 02:37:32PM +0100, Ingo Molnar escreveu:
> > >
> > > * Jens Axboe <[email protected]> wrote:
> > >
> > > > > One is introduced by "block: get rid of the manual directory counting in blktrace"
> > > > > (f48fc4d32e24c0b6a18aad30305d819bcc68c049). Two are "blktrace: port to tracepoints"
> > > > > (5f3ea37c7716db4e894a480e0c18b24399595b6b). Both commits are in mainline.
> > > > >
> > > > > Since 2 of the bugs will rarely happen in real-life, and the 3rd
> > > > > one is a small issue, and we were so close to the release of
> > > > > .29, so I sent the fixes for -tip tree but not mainline. But if
> > > > > we are going merge tip/blktrace to .31, I guess it's better to
> > > > > merge that 3 fixes to .30?
> > > >
> > > > Since you are the person that worked on it most lately, your
> > > > opinion matters the most. What do you think, is it ready for
> > > > 2.6.30 or should it wait for .31?
> > >
> > > Yeah. Li, Arnaldo, what do you think?
> > >
> > > Delaying them would be quite painful at this stage though - the
> > > blktrace plugin conversion was done with (ahem) your initial support
> > > so the commits got (foolishly, in hindsight ;-) interwoven into 300
> > > commits of the 2.6.30 tracing tree.
> > >
> > > Delaying them would also be technically baseless - there are no
> > > known regressions or bugs in this code. (If you know about bugs then
> > > please speak up so we can fix them! ;-)
> > >
> > > At this last minute stage we can do two things: merge it now or if
> > > you NAK it then we'll rebase the last ~2 months of the tracing tree
> > > with hundreds of commits (sigh), destroy its true history in the
> > > process and eradicate the blktrace bits.
> > >
> > > I'd like to avoid the second option if possible as it destroys real
> > > value (these changes are really nice improvements, a lot of work
> > > went into them and there's no open regressions so i can see no
> > > objective reason why they couldnt go upstream now) but it's your
> > > choice really, you maintain block/* :-)
> >
> > Well, after this set of fixes by Li the only problem I'm aware of is the
> > __trace_note_message, that is using ftrace_vprintk, that I didn't notice
> > because I wasn't using CFQ when developing it, and that gets the output
> > of the _ftrace_ plugin wedged, but that doesn't affect normal blktrace
> > operation.
> >
> > I'll try to get that fixed somehow today, other than that I'm not aware
> > of any other problem, so I think it could get into 2.6.30 on the premise
> > that normal blktrace operation is as stable as before and that the
> > ftrace plugin is recent work and may still need some fixes.
>
> Well, judging whether it is as stable as before is exactly what is asked
> of you and Li :-). So which is it?
I just checked out the latest linux-tip tree and will do some tests,
checking the work that Li did and will try various sequences such as
first using ftrace then blktrace, the reverse, trying to do both, trying
to parse the trace_pipe with blkparse -, etc on a 4way Xeon machine and
will report here the results.
- Arnaldo
>>> Yeah. Li, Arnaldo, what do you think?
>>>
>>> Delaying them would be quite painful at this stage though - the
>>> blktrace plugin conversion was done with (ahem) your initial support
>>> so the commits got (foolishly, in hindsight ;-) interwoven into 300
>>> commits of the 2.6.30 tracing tree.
>>>
>>> Delaying them would also be technically baseless - there are no
>>> known regressions or bugs in this code. (If you know about bugs then
>>> please speak up so we can fix them! ;-)
>>>
I do know some bugs. ;)
>>> At this last minute stage we can do two things: merge it now or if
>>> you NAK it then we'll rebase the last ~2 months of the tracing tree
>>> with hundreds of commits (sigh), destroy its true history in the
>>> process and eradicate the blktrace bits.
>>>
>>> I'd like to avoid the second option if possible as it destroys real
>>> value (these changes are really nice improvements, a lot of work
>>> went into them and there's no open regressions so i can see no
>>> objective reason why they couldnt go upstream now) but it's your
>>> choice really, you maintain block/* :-)
>> Well, after this set of fixes by Li the only problem I'm aware of is the
>> __trace_note_message, that is using ftrace_vprintk, that I didn't notice
This was the first issue I saw when I started to use blktrace in -tip tree,
and I made a fix yesterday. I'll post it soon, with some other fixes.
>> because I wasn't using CFQ when developing it, and that gets the output
>> of the _ftrace_ plugin wedged, but that doesn't affect normal blktrace
>> operation.
>>
>> I'll try to get that fixed somehow today, other than that I'm not aware
>> of any other problem, so I think it could get into 2.6.30 on the premise
>> that normal blktrace operation is as stable as before and that the
>> ftrace plugin is recent work and may still need some fixes.
>
> Well, judging whether it is as stable as before is exactly what is asked
> of you and Li :-). So which is it?
>
With the fixes I posted and will post, both ioctl-based blktrace and ftrace
-based blktrace should be working fine, and I'll continue to use and test
it.