2009-03-24 08:04:12

by Li Zefan

[permalink] [raw]
Subject: [PATCH 0/5] blktrace: various cleanups and fixes, part 2

The first part is here:
http://lkml.org/lkml/2009/3/19/475

And this is the 2nd part, and I still have some pending fixes..

Each patch is a small one, except the last one, which is still not big.

[PATCH 1/5] blktrace: mark ddir_act and what2act const
[PATCH 2/5] blktrace: fix wrong calculation of RWBS
[PATCH 3/5] blktrace: fix off-by-one bug
[PATCH 4/5] blktrace: fix t_error()
[PATCH 5/5] blktrace: print human-readable act_mask

blktrace.c | 127 ++++++++++++++++++++++++++++++++++++-------------------------
1 file changed, 77 insertions(+), 50 deletions(-)

Signed-off-by: Li Zefan <[email protected]>


2009-03-24 08:04:40

by Li Zefan

[permalink] [raw]
Subject: [PATCH 1/5] blktrace: mark ddir_act and what2act const

ddir_act and what2act are always stay immutable.

Signed-off-by: Li Zefan <[email protected]>
---
kernel/trace/blktrace.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 108f4f7..1ffcbd4 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -147,8 +147,8 @@ static int act_log_check(struct blk_trace *bt, u32 what, sector_t sector,
/*
* Data direction bit lookup
*/
-static u32 ddir_act[2] __read_mostly = { BLK_TC_ACT(BLK_TC_READ),
- BLK_TC_ACT(BLK_TC_WRITE) };
+static const u32 ddir_act[2] = { BLK_TC_ACT(BLK_TC_READ),
+ BLK_TC_ACT(BLK_TC_WRITE) };

/* The ilog2() calls fall out because they're constant */
#define MASK_TC_BIT(rw, __name) ((rw & (1 << BIO_RW_ ## __name)) << \
@@ -1116,10 +1116,10 @@ static void blk_tracer_reset(struct trace_array *tr)
blk_tracer_stop(tr);
}

-static struct {
+static const struct {
const char *act[2];
int (*print)(struct trace_seq *s, const struct trace_entry *ent);
-} what2act[] __read_mostly = {
+} what2act[] = {
[__BLK_TA_QUEUE] = {{ "Q", "queue" }, blk_log_generic },
[__BLK_TA_BACKMERGE] = {{ "M", "backmerge" }, blk_log_generic },
[__BLK_TA_FRONTMERGE] = {{ "F", "frontmerge" }, blk_log_generic },
--
1.5.4.rc3

2009-03-24 08:05:10

by Li Zefan

[permalink] [raw]
Subject: [PATCH 2/5] blktrace: fix wrong calculation of RWBS

Trace categories are the upper 16 bits, not the lower 16 bits.

Signed-off-by: Li Zefan <[email protected]>
---
kernel/trace/blktrace.c | 13 +++++++------
1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 1ffcbd4..9af4143 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -922,23 +922,24 @@ static void blk_unregister_tracepoints(void)
static void fill_rwbs(char *rwbs, const struct blk_io_trace *t)
{
int i = 0;
+ int tc = t->action >> BLK_TC_SHIFT;

- if (t->action & BLK_TC_DISCARD)
+ if (tc & BLK_TC_DISCARD)
rwbs[i++] = 'D';
- else if (t->action & BLK_TC_WRITE)
+ else if (tc & BLK_TC_WRITE)
rwbs[i++] = 'W';
else if (t->bytes)
rwbs[i++] = 'R';
else
rwbs[i++] = 'N';

- if (t->action & BLK_TC_AHEAD)
+ if (tc & BLK_TC_AHEAD)
rwbs[i++] = 'A';
- if (t->action & BLK_TC_BARRIER)
+ if (tc & BLK_TC_BARRIER)
rwbs[i++] = 'B';
- if (t->action & BLK_TC_SYNC)
+ if (tc & BLK_TC_SYNC)
rwbs[i++] = 'S';
- if (t->action & BLK_TC_META)
+ if (tc & BLK_TC_META)
rwbs[i++] = 'M';

rwbs[i] = '\0';
--
1.5.4.rc3

2009-03-24 08:05:52

by Li Zefan

[permalink] [raw]
Subject: [PATCH 3/5] blktrace: fix off-by-one bug

'what' is used as the index of array what2act, so it can't >= the array size.

Signed-off-by: Li Zefan <[email protected]>
---
kernel/trace/blktrace.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 9af4143..0e91caa 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1149,7 +1149,7 @@ static enum print_line_t blk_trace_event_print(struct trace_iterator *iter,
if (!trace_print_context(iter))
return TRACE_TYPE_PARTIAL_LINE;

- if (unlikely(what == 0 || what > ARRAY_SIZE(what2act)))
+ if (unlikely(what == 0 || what >= ARRAY_SIZE(what2act)))
ret = trace_seq_printf(s, "Bad pc action %x\n", what);
else {
const bool long_act = !!(trace_flags & TRACE_ITER_VERBOSE);
@@ -1196,7 +1196,7 @@ static enum print_line_t blk_tracer_print_line(struct trace_iterator *iter)
t = (const struct blk_io_trace *)iter->ent;
what = t->action & ((1 << BLK_TC_SHIFT) - 1);

- if (unlikely(what == 0 || what > ARRAY_SIZE(what2act)))
+ if (unlikely(what == 0 || what >= ARRAY_SIZE(what2act)))
ret = trace_seq_printf(&iter->seq, "Bad pc action %x\n", what);
else {
const bool long_act = !!(trace_flags & TRACE_ITER_VERBOSE);
--
1.5.4.rc3

2009-03-24 08:06:21

by Li Zefan

[permalink] [raw]
Subject: [PATCH 4/5] blktrace: fix t_error()

t_error() should return t->error but not t->sector.

Signed-off-by: Li Zefan <[email protected]>
---
kernel/trace/blktrace.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 0e91caa..f33c176 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -968,7 +968,7 @@ static inline unsigned long long t_sector(const struct trace_entry *ent)

static inline __u16 t_error(const struct trace_entry *ent)
{
- return te_blk_io_trace(ent)->sector;
+ return te_blk_io_trace(ent)->error;
}

static __u64 get_pdu_int(const struct trace_entry *ent)
--
1.5.4.rc3

2009-03-24 08:06:46

by Li Zefan

[permalink] [raw]
Subject: [PATCH 5/5] blktrace: print human-readable act_mask

Print stringified act_mask instead of hex value:
# cat act_mask
read,write,barrier,sync,queue,requeue,issue,complete,fs,pc,ahead,meta,
discard,drv_data
# echo "meta,write" > act_mask
# cat act_mask
write,meta

Also:
- make act_mask accept "ahead", "meta", "discard" and "drv_data"
- use strsep() instead of strchr() to parse user input
- return -EINVAL if a token is not found in the mask map
- propagate error value of blk_trace_mask2str() to userspace, but not
always return -ENXIO.

Signed-off-by: Li Zefan <[email protected]>
---
kernel/trace/blktrace.c | 100 +++++++++++++++++++++++++++++-----------------
1 files changed, 63 insertions(+), 37 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index f33c176..3b2ba28 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1316,53 +1316,77 @@ struct attribute_group blk_trace_attr_group = {
.attrs = blk_trace_attrs,
};

-static int blk_str2act_mask(const char *str)
+static struct {
+ int mask;
+ const char *str;
+} mask_maps[] = {
+ { BLK_TC_READ, "read" },
+ { BLK_TC_WRITE, "write" },
+ { BLK_TC_BARRIER, "barrier" },
+ { BLK_TC_SYNC, "sync" },
+ { BLK_TC_QUEUE, "queue" },
+ { BLK_TC_REQUEUE, "requeue" },
+ { BLK_TC_ISSUE, "issue" },
+ { BLK_TC_COMPLETE, "complete" },
+ { BLK_TC_FS, "fs" },
+ { BLK_TC_PC, "pc" },
+ { BLK_TC_AHEAD, "ahead" },
+ { BLK_TC_META, "meta" },
+ { BLK_TC_DISCARD, "discard" },
+ { BLK_TC_DRV_DATA, "drv_data" },
+};
+
+static int blk_trace_str2mask(const char *str)
{
+ int i;
int mask = 0;
- char *copy = kstrdup(str, GFP_KERNEL), *s;
+ char *s, *token;

- if (copy == NULL)
+ s = kstrdup(str, GFP_KERNEL);
+ if (s == NULL)
return -ENOMEM;
-
- s = strstrip(copy);
+ s = strstrip(s);

while (1) {
- char *sep = strchr(s, ',');
-
- if (sep != NULL)
- *sep = '\0';
-
- if (strcasecmp(s, "barrier") == 0)
- mask |= BLK_TC_BARRIER;
- else if (strcasecmp(s, "complete") == 0)
- mask |= BLK_TC_COMPLETE;
- else if (strcasecmp(s, "fs") == 0)
- mask |= BLK_TC_FS;
- else if (strcasecmp(s, "issue") == 0)
- mask |= BLK_TC_ISSUE;
- else if (strcasecmp(s, "pc") == 0)
- mask |= BLK_TC_PC;
- else if (strcasecmp(s, "queue") == 0)
- mask |= BLK_TC_QUEUE;
- else if (strcasecmp(s, "read") == 0)
- mask |= BLK_TC_READ;
- else if (strcasecmp(s, "requeue") == 0)
- mask |= BLK_TC_REQUEUE;
- else if (strcasecmp(s, "sync") == 0)
- mask |= BLK_TC_SYNC;
- else if (strcasecmp(s, "write") == 0)
- mask |= BLK_TC_WRITE;
-
- if (sep == NULL)
+ token = strsep(&s, ",");
+ if (token == NULL)
break;

- s = sep + 1;
+ if (*token == '\0')
+ continue;
+
+ for (i = 0; i < ARRAY_SIZE(mask_maps); i++) {
+ if (strcasecmp(token, mask_maps[i].str) == 0) {
+ mask |= mask_maps[i].mask;
+ break;
+ }
+ }
+ if (i == ARRAY_SIZE(mask_maps)) {
+ mask = -EINVAL;
+ break;
+ }
}
- kfree(copy);
+ kfree(s);

return mask;
}

+static ssize_t blk_trace_mask2str(char *buf, int mask)
+{
+ int i;
+ char *p = buf;
+
+ for (i = 0; i < ARRAY_SIZE(mask_maps); i++) {
+ if (mask & mask_maps[i].mask) {
+ p += sprintf(p, "%s%s",
+ (p == buf) ? "" : ",", mask_maps[i].str);
+ }
+ }
+ *p++ = '\n';
+
+ return (p - buf);
+}
+
static struct request_queue *blk_trace_get_queue(struct block_device *bdev)
{
if (bdev->bd_disk == NULL)
@@ -1399,7 +1423,7 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
if (q->blk_trace == NULL)
ret = sprintf(buf, "disabled\n");
else if (attr == &dev_attr_act_mask)
- ret = sprintf(buf, "%#x\n", q->blk_trace->act_mask);
+ ret = blk_trace_mask2str(buf, q->blk_trace->act_mask);
else if (attr == &dev_attr_pid)
ret = sprintf(buf, "%u\n", q->blk_trace->pid);
else if (attr == &dev_attr_start_lba)
@@ -1432,9 +1456,11 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
if (attr == &dev_attr_act_mask) {
if (sscanf(buf, "%llx", &value) != 1) {
/* Assume it is a list of trace category names */
- value = blk_str2act_mask(buf);
- if (value < 0)
+ value = blk_trace_str2mask(buf);
+ if (value < 0) {
+ ret = value;
goto out;
+ }
}
} else if (sscanf(buf, "%llu", &value) != 1)
goto out;
--
1.5.4.rc3

2009-03-24 08:28:30

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 3/5] blktrace: fix off-by-one bug


* Li Zefan <[email protected]> wrote:

> 'what' is used as the index of array what2act, so it can't >= the array size.
>
> Signed-off-by: Li Zefan <[email protected]>
> ---
> kernel/trace/blktrace.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 9af4143..0e91caa 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -1149,7 +1149,7 @@ static enum print_line_t blk_trace_event_print(struct trace_iterator *iter,
> if (!trace_print_context(iter))
> return TRACE_TYPE_PARTIAL_LINE;
>
> - if (unlikely(what == 0 || what > ARRAY_SIZE(what2act)))
> + if (unlikely(what == 0 || what >= ARRAY_SIZE(what2act)))
> ret = trace_seq_printf(s, "Bad pc action %x\n", what);
> else {
> const bool long_act = !!(trace_flags & TRACE_ITER_VERBOSE);
> @@ -1196,7 +1196,7 @@ static enum print_line_t blk_tracer_print_line(struct trace_iterator *iter)
> t = (const struct blk_io_trace *)iter->ent;
> what = t->action & ((1 << BLK_TC_SHIFT) - 1);
>
> - if (unlikely(what == 0 || what > ARRAY_SIZE(what2act)))
> + if (unlikely(what == 0 || what >= ARRAY_SIZE(what2act)))

ah, nice. How did you notice - did we miss "remap" events due to
this bug?

Ingo

2009-03-24 08:33:17

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 3/5] blktrace: fix off-by-one bug

>> - if (unlikely(what == 0 || what > ARRAY_SIZE(what2act)))
>> + if (unlikely(what == 0 || what >= ARRAY_SIZE(what2act)))
>
> ah, nice. How did you notice - did we miss "remap" events due to
> this bug?
>

By code review, but we can get NULL dereference bug if we receive an
"abort" event, this event may be generated only when using device-mapper.

We don't print out this event currently, neither does the userspace
blktrace, which should be fixed.

2009-03-24 08:39:41

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask


* Li Zefan <[email protected]> wrote:

> Print stringified act_mask instead of hex value:
> # cat act_mask
> read,write,barrier,sync,queue,requeue,issue,complete,fs,pc,ahead,meta,
> discard,drv_data
> # echo "meta,write" > act_mask
> # cat act_mask
> write,meta

Nice!

It would also be nice to activate trace filters for the blktrace
tracepoints - i.e. to convert them to the TRACE_EVENT() enumeration
format. Beyond user-space parseable field enumeration and filter,
that will also speed up tracing and allows binary record streaming
with splice() zero-copy.

Via that "act_mask" can become a filterable field and you can define
expressions to filter. All other fields like sector become in-kernel
filterable too.

See a few examples here:

include/trace/irq_event_types.h
include/trace/sched_event_types.h

Note, blktrace tracepoints are certainly more complex than the
tracepoints above - you can embedd C statements in TRACE_EVENT()'s
TP_fast_assign() bit.

It was specifically designed to allow the support of blktrace
tracepoints, so you can embedd the blk_pc_request() and disk_devt()
translation for the block_rq_complete event or
block_rq_requeue/issue tracepoints.

> +static struct {
> + int mask;
> + const char *str;
> +} mask_maps[] = {
> + { BLK_TC_READ, "read" },
> + { BLK_TC_WRITE, "write" },
> + { BLK_TC_BARRIER, "barrier" },
> + { BLK_TC_SYNC, "sync" },
> + { BLK_TC_QUEUE, "queue" },
> + { BLK_TC_REQUEUE, "requeue" },
> + { BLK_TC_ISSUE, "issue" },
> + { BLK_TC_COMPLETE, "complete" },
> + { BLK_TC_FS, "fs" },
> + { BLK_TC_PC, "pc" },
> + { BLK_TC_AHEAD, "ahead" },
> + { BLK_TC_META, "meta" },
> + { BLK_TC_DISCARD, "discard" },
> + { BLK_TC_DRV_DATA, "drv_data" },
> +};

(minor nit: this should be a const.)

Jens, Arnaldo, Steve, the series from Li looks good to me - Ack?

Ingo

2009-03-24 08:40:28

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask


* Li Zefan <[email protected]> wrote:

> Print stringified act_mask instead of hex value:
> # cat act_mask
> read,write,barrier,sync,queue,requeue,issue,complete,fs,pc,ahead,meta,
> discard,drv_data
> # echo "meta,write" > act_mask
> # cat act_mask
> write,meta

btw., does user-space blktrace make use of this facility?

Ingo

2009-03-24 08:41:21

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 3/5] blktrace: fix off-by-one bug

Li Zefan wrote:
>>> - if (unlikely(what == 0 || what > ARRAY_SIZE(what2act)))
>>> + if (unlikely(what == 0 || what >= ARRAY_SIZE(what2act)))
>> ah, nice. How did you notice - did we miss "remap" events due to
>> this bug?
>>

forgot to mention, we didn't miss any "remap" events.

>
> By code review, but we can get NULL dereference bug if we receive an
> "abort" event, this event may be generated only when using device-mapper.
>

and not NULL dereference, but accessing invalid memory.

what2act["abort"]->print(...)

and "abort" == ARRAY_SIZE(what2act).

> We don't print out this event currently, neither does the userspace
> blktrace, which should be fixed.
>

2009-03-24 08:44:35

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 4/5] blktrace: fix t_error()


* Li Zefan <[email protected]> wrote:

> t_error() should return t->error but not t->sector.

Nice. This bug got introduced by the blktrace->ftrace conversion.
Old blktrace passed t->error straight to user-space.

Ingo

2009-03-24 08:46:19

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask

Ingo Molnar wrote:
> * Li Zefan <[email protected]> wrote:
>
>> Print stringified act_mask instead of hex value:
>> # cat act_mask
>> read,write,barrier,sync,queue,requeue,issue,complete,fs,pc,ahead,meta,
>> discard,drv_data
>> # echo "meta,write" > act_mask
>> # cat act_mask
>> write,meta
>
> btw., does user-space blktrace make use of this facility?
>

This facility can be used when we are using ftrace blktrace.

The userspace blktrace gets the filter-list via [-A hex-mask] and [-a str-mask].

like: blktrace -d /dev/sda -a write -a meta

2009-03-24 08:47:50

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 3/5] blktrace: fix off-by-one bug


* Li Zefan <[email protected]> wrote:

> Li Zefan wrote:
> >>> - if (unlikely(what == 0 || what > ARRAY_SIZE(what2act)))
> >>> + if (unlikely(what == 0 || what >= ARRAY_SIZE(what2act)))
> >> ah, nice. How did you notice - did we miss "remap" events due to
> >> this bug?
> >>
>
> forgot to mention, we didn't miss any "remap" events.
>
> >
> > By code review, but we can get NULL dereference bug if we receive an
> > "abort" event, this event may be generated only when using device-mapper.
> >
>
> and not NULL dereference, but accessing invalid memory.
>
> what2act["abort"]->print(...)
>
> and "abort" == ARRAY_SIZE(what2act).

Ah. This:

[__BLK_TA_QUEUE] = {{ "Q", "queue" }, blk_log_generic },
[__BLK_TA_BACKMERGE] = {{ "M", "backmerge" }, blk_log_generic },
[__BLK_TA_FRONTMERGE] = {{ "F", "frontmerge" }, blk_log_generic },
[__BLK_TA_GETRQ] = {{ "G", "getrq" }, blk_log_generic },
[__BLK_TA_SLEEPRQ] = {{ "S", "sleeprq" }, blk_log_generic },
[__BLK_TA_REQUEUE] = {{ "R", "requeue" }, blk_log_with_error },
[__BLK_TA_ISSUE] = {{ "D", "issue" }, blk_log_generic },
[__BLK_TA_COMPLETE] = {{ "C", "complete" }, blk_log_with_error },
[__BLK_TA_PLUG] = {{ "P", "plug" }, blk_log_plug },
[__BLK_TA_UNPLUG_IO] = {{ "U", "unplug_io" }, blk_log_unplug },
[__BLK_TA_UNPLUG_TIMER] = {{ "UT", "unplug_timer" }, blk_log_unplug },
[__BLK_TA_INSERT] = {{ "I", "insert" }, blk_log_generic },
[__BLK_TA_SPLIT] = {{ "X", "split" }, blk_log_split },
[__BLK_TA_BOUNCE] = {{ "B", "bounce" }, blk_log_generic },
[__BLK_TA_REMAP] = {{ "A", "remap" }, blk_log_remap },

does not have a __BLK_TA_ABORT entry currently - it should have,
right?

Ingo

2009-03-24 08:49:54

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 3/5] blktrace: fix off-by-one bug

> Ah. This:
>
> [__BLK_TA_QUEUE] = {{ "Q", "queue" }, blk_log_generic },
> [__BLK_TA_BACKMERGE] = {{ "M", "backmerge" }, blk_log_generic },
> [__BLK_TA_FRONTMERGE] = {{ "F", "frontmerge" }, blk_log_generic },
> [__BLK_TA_GETRQ] = {{ "G", "getrq" }, blk_log_generic },
> [__BLK_TA_SLEEPRQ] = {{ "S", "sleeprq" }, blk_log_generic },
> [__BLK_TA_REQUEUE] = {{ "R", "requeue" }, blk_log_with_error },
> [__BLK_TA_ISSUE] = {{ "D", "issue" }, blk_log_generic },
> [__BLK_TA_COMPLETE] = {{ "C", "complete" }, blk_log_with_error },
> [__BLK_TA_PLUG] = {{ "P", "plug" }, blk_log_plug },
> [__BLK_TA_UNPLUG_IO] = {{ "U", "unplug_io" }, blk_log_unplug },
> [__BLK_TA_UNPLUG_TIMER] = {{ "UT", "unplug_timer" }, blk_log_unplug },
> [__BLK_TA_INSERT] = {{ "I", "insert" }, blk_log_generic },
> [__BLK_TA_SPLIT] = {{ "X", "split" }, blk_log_split },
> [__BLK_TA_BOUNCE] = {{ "B", "bounce" }, blk_log_generic },
> [__BLK_TA_REMAP] = {{ "A", "remap" }, blk_log_remap },
>
> does not have a __BLK_TA_ABORT entry currently - it should have,
> right?
>

Right, currently I'm not sure if blk_log_with_error is suitable for "abort" event,
so I haven't fixed it yet. :)

2009-03-24 09:01:21

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask

Ingo Molnar wrote:
> * Li Zefan <[email protected]> wrote:
>
>> Print stringified act_mask instead of hex value:
>> # cat act_mask
>> read,write,barrier,sync,queue,requeue,issue,complete,fs,pc,ahead,meta,
>> discard,drv_data
>> # echo "meta,write" > act_mask
>> # cat act_mask
>> write,meta
>
> Nice!
>
> It would also be nice to activate trace filters for the blktrace
> tracepoints - i.e. to convert them to the TRACE_EVENT() enumeration
> format. Beyond user-space parseable field enumeration and filter,
> that will also speed up tracing and allows binary record streaming
> with splice() zero-copy.
>
> Via that "act_mask" can become a filterable field and you can define
> expressions to filter. All other fields like sector become in-kernel
> filterable too.
>
> See a few examples here:
>
> include/trace/irq_event_types.h
> include/trace/sched_event_types.h
>
> Note, blktrace tracepoints are certainly more complex than the
> tracepoints above - you can embedd C statements in TRACE_EVENT()'s
> TP_fast_assign() bit.
>
> It was specifically designed to allow the support of blktrace
> tracepoints, so you can embedd the blk_pc_request() and disk_devt()
> translation for the block_rq_complete event or
> block_rq_requeue/issue tracepoints.
>

I'll look into this when I have time. :)

2009-03-24 09:43:24

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask

> @@ -1432,9 +1456,11 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
> if (attr == &dev_attr_act_mask) {
> if (sscanf(buf, "%llx", &value) != 1) {
> /* Assume it is a list of trace category names */
> - value = blk_str2act_mask(buf);
> - if (value < 0)
> + value = blk_trace_str2mask(buf);
> + if (value < 0) {
> + ret = value;
> goto out;
> + }

value is u64, it can < 0.


===================


From: Li Zefan <[email protected]>
Subject: [PATCH] blktrace: print human-readable act_mask

Print stringified act_mask instead of hex value:
# cat act_mask
read,write,barrier,sync,queue,requeue,issue,complete,fs,pc,ahead,meta,
discard,drv_data
# echo "meta,write" > act_mask
# cat act_mask
write,meta

Also:
- make act_mask accept "ahead", "meta", "discard" and "drv_data"
- use strsep() instead of strchr() to parse user input
- return -EINVAL if a token is not found in the mask map
- fix a bug that 'value' is unsigned, so it can < 0
- propagate error value of blk_trace_mask2str() to userspace, but not
always return -ENXIO.

Signed-off-by: Li Zefan <[email protected]>
---
kernel/trace/blktrace.c | 103 +++++++++++++++++++++++++++++-----------------
1 files changed, 65 insertions(+), 38 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index f33c176..9691b42 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1316,53 +1316,77 @@ struct attribute_group blk_trace_attr_group = {
.attrs = blk_trace_attrs,
};

-static int blk_str2act_mask(const char *str)
+static const struct {
+ int mask;
+ const char *str;
+} mask_maps[] = {
+ { BLK_TC_READ, "read" },
+ { BLK_TC_WRITE, "write" },
+ { BLK_TC_BARRIER, "barrier" },
+ { BLK_TC_SYNC, "sync" },
+ { BLK_TC_QUEUE, "queue" },
+ { BLK_TC_REQUEUE, "requeue" },
+ { BLK_TC_ISSUE, "issue" },
+ { BLK_TC_COMPLETE, "complete" },
+ { BLK_TC_FS, "fs" },
+ { BLK_TC_PC, "pc" },
+ { BLK_TC_AHEAD, "ahead" },
+ { BLK_TC_META, "meta" },
+ { BLK_TC_DISCARD, "discard" },
+ { BLK_TC_DRV_DATA, "drv_data" },
+};
+
+static int blk_trace_str2mask(const char *str)
{
+ int i;
int mask = 0;
- char *copy = kstrdup(str, GFP_KERNEL), *s;
+ char *s, *token;

- if (copy == NULL)
+ s = kstrdup(str, GFP_KERNEL);
+ if (s == NULL)
return -ENOMEM;
-
- s = strstrip(copy);
+ s = strstrip(s);

while (1) {
- char *sep = strchr(s, ',');
-
- if (sep != NULL)
- *sep = '\0';
-
- if (strcasecmp(s, "barrier") == 0)
- mask |= BLK_TC_BARRIER;
- else if (strcasecmp(s, "complete") == 0)
- mask |= BLK_TC_COMPLETE;
- else if (strcasecmp(s, "fs") == 0)
- mask |= BLK_TC_FS;
- else if (strcasecmp(s, "issue") == 0)
- mask |= BLK_TC_ISSUE;
- else if (strcasecmp(s, "pc") == 0)
- mask |= BLK_TC_PC;
- else if (strcasecmp(s, "queue") == 0)
- mask |= BLK_TC_QUEUE;
- else if (strcasecmp(s, "read") == 0)
- mask |= BLK_TC_READ;
- else if (strcasecmp(s, "requeue") == 0)
- mask |= BLK_TC_REQUEUE;
- else if (strcasecmp(s, "sync") == 0)
- mask |= BLK_TC_SYNC;
- else if (strcasecmp(s, "write") == 0)
- mask |= BLK_TC_WRITE;
-
- if (sep == NULL)
+ token = strsep(&s, ",");
+ if (token == NULL)
break;

- s = sep + 1;
+ if (*token == '\0')
+ continue;
+
+ for (i = 0; i < ARRAY_SIZE(mask_maps); i++) {
+ if (strcasecmp(token, mask_maps[i].str) == 0) {
+ mask |= mask_maps[i].mask;
+ break;
+ }
+ }
+ if (i == ARRAY_SIZE(mask_maps)) {
+ mask = -EINVAL;
+ break;
+ }
}
- kfree(copy);
+ kfree(s);

return mask;
}

+static ssize_t blk_trace_mask2str(char *buf, int mask)
+{
+ int i;
+ char *p = buf;
+
+ for (i = 0; i < ARRAY_SIZE(mask_maps); i++) {
+ if (mask & mask_maps[i].mask) {
+ p += sprintf(p, "%s%s",
+ (p == buf) ? "" : ",", mask_maps[i].str);
+ }
+ }
+ *p++ = '\n';
+
+ return p - buf;
+}
+
static struct request_queue *blk_trace_get_queue(struct block_device *bdev)
{
if (bdev->bd_disk == NULL)
@@ -1399,7 +1423,7 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
if (q->blk_trace == NULL)
ret = sprintf(buf, "disabled\n");
else if (attr == &dev_attr_act_mask)
- ret = sprintf(buf, "%#x\n", q->blk_trace->act_mask);
+ ret = blk_trace_mask2str(buf, q->blk_trace->act_mask);
else if (attr == &dev_attr_pid)
ret = sprintf(buf, "%u\n", q->blk_trace->pid);
else if (attr == &dev_attr_start_lba)
@@ -1424,7 +1448,7 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
struct request_queue *q;
struct hd_struct *p;
u64 value;
- ssize_t ret = -ENXIO;
+ ssize_t ret = -EINVAL;

if (count == 0)
goto out;
@@ -1432,13 +1456,16 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
if (attr == &dev_attr_act_mask) {
if (sscanf(buf, "%llx", &value) != 1) {
/* Assume it is a list of trace category names */
- value = blk_str2act_mask(buf);
- if (value < 0)
+ ret = blk_trace_str2mask(buf);
+ if (ret < 0)
goto out;
+ value = ret;
}
} else if (sscanf(buf, "%llu", &value) != 1)
goto out;

+ ret = -ENXIO;
+
lock_kernel();
p = dev_to_part(dev);
bdev = bdget(part_devt(p));
--
1.5.4.rc3

2009-03-24 09:48:46

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask


* Li Zefan <[email protected]> wrote:

> > @@ -1432,9 +1456,11 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
> > if (attr == &dev_attr_act_mask) {
> > if (sscanf(buf, "%llx", &value) != 1) {
> > /* Assume it is a list of trace category names */
> > - value = blk_str2act_mask(buf);
> > - if (value < 0)
> > + value = blk_trace_str2mask(buf);
> > + if (value < 0) {
> > + ret = value;
> > goto out;
> > + }
>
> value is u64, it can < 0.

Ok. Small nit: please in such cases change the subject line back to
something like:

[PATCH 5/5, v2] blktrace: print human-readable act_mask

So that's it's easy to see which is the latest version, in thread
view.

Ingo

2009-03-24 11:49:52

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 0/5] blktrace: various cleanups and fixes, part 2

On Tue, Mar 24 2009, Li Zefan wrote:
> The first part is here:
> http://lkml.org/lkml/2009/3/19/475
>
> And this is the 2nd part, and I still have some pending fixes..
>
> Each patch is a small one, except the last one, which is still not big.
>
> [PATCH 1/5] blktrace: mark ddir_act and what2act const
> [PATCH 2/5] blktrace: fix wrong calculation of RWBS
> [PATCH 3/5] blktrace: fix off-by-one bug
> [PATCH 4/5] blktrace: fix t_error()
> [PATCH 5/5] blktrace: print human-readable act_mask
>
> blktrace.c | 127 ++++++++++++++++++++++++++++++++++++-------------------------
> 1 file changed, 77 insertions(+), 50 deletions(-)
>
> Signed-off-by: Li Zefan <[email protected]>

Looks good, as does your previous series. You can add my acked-by to
both series.

--
Jens Axboe

2009-03-24 12:11:10

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/5] blktrace: various cleanups and fixes, part 2


* Jens Axboe <[email protected]> wrote:

> On Tue, Mar 24 2009, Li Zefan wrote:
> > The first part is here:
> > http://lkml.org/lkml/2009/3/19/475
> >
> > And this is the 2nd part, and I still have some pending fixes..
> >
> > Each patch is a small one, except the last one, which is still not big.
> >
> > [PATCH 1/5] blktrace: mark ddir_act and what2act const
> > [PATCH 2/5] blktrace: fix wrong calculation of RWBS
> > [PATCH 3/5] blktrace: fix off-by-one bug
> > [PATCH 4/5] blktrace: fix t_error()
> > [PATCH 5/5] blktrace: print human-readable act_mask
> >
> > blktrace.c | 127 ++++++++++++++++++++++++++++++++++++-------------------------
> > 1 file changed, 77 insertions(+), 50 deletions(-)
> >
> > Signed-off-by: Li Zefan <[email protected]>
>
> Looks good, as does your previous series. You can add my acked-by
> to both series.

Thanks Jens, applied.

Ingo

2009-03-24 12:13:26

by Li Zefan

[permalink] [raw]
Subject: [tip:tracing/blktrace] blktrace: mark ddir_act[] const

Commit-ID: e4955c9986a27bb47ddeb6cd55803053f547e2e9
Gitweb: http://git.kernel.org/tip/e4955c9986a27bb47ddeb6cd55803053f547e2e9
Author: Li Zefan <[email protected]>
AuthorDate: Tue, 24 Mar 2009 16:04:37 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 24 Mar 2009 13:08:59 +0100

blktrace: mark ddir_act[] const

Impact: cleanup

ddir_act and what2act always stay immutable.

Signed-off-by: Li Zefan <[email protected]>
Acked-by: Jens Axboe <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
kernel/trace/blktrace.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 108f4f7..1ffcbd4 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -147,8 +147,8 @@ static int act_log_check(struct blk_trace *bt, u32 what, sector_t sector,
/*
* Data direction bit lookup
*/
-static u32 ddir_act[2] __read_mostly = { BLK_TC_ACT(BLK_TC_READ),
- BLK_TC_ACT(BLK_TC_WRITE) };
+static const u32 ddir_act[2] = { BLK_TC_ACT(BLK_TC_READ),
+ BLK_TC_ACT(BLK_TC_WRITE) };

/* The ilog2() calls fall out because they're constant */
#define MASK_TC_BIT(rw, __name) ((rw & (1 << BIO_RW_ ## __name)) << \
@@ -1116,10 +1116,10 @@ static void blk_tracer_reset(struct trace_array *tr)
blk_tracer_stop(tr);
}

-static struct {
+static const struct {
const char *act[2];
int (*print)(struct trace_seq *s, const struct trace_entry *ent);
-} what2act[] __read_mostly = {
+} what2act[] = {
[__BLK_TA_QUEUE] = {{ "Q", "queue" }, blk_log_generic },
[__BLK_TA_BACKMERGE] = {{ "M", "backmerge" }, blk_log_generic },
[__BLK_TA_FRONTMERGE] = {{ "F", "frontmerge" }, blk_log_generic },

2009-03-24 12:13:44

by Li Zefan

[permalink] [raw]
Subject: [tip:tracing/blktrace] blktrace: fix t_error()

Commit-ID: e0dc81bec0927fa0c8aabc521793161909eef7a5
Gitweb: http://git.kernel.org/tip/e0dc81bec0927fa0c8aabc521793161909eef7a5
Author: Li Zefan <[email protected]>
AuthorDate: Tue, 24 Mar 2009 16:05:51 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 24 Mar 2009 13:09:00 +0100

blktrace: fix t_error()

Impact: fix error flag output

t_error() should return t->error but not t->sector.

Signed-off-by: Li Zefan <[email protected]>
Acked-by: Jens Axboe <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
kernel/trace/blktrace.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 9af4143..f69f8bd 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -968,7 +968,7 @@ static inline unsigned long long t_sector(const struct trace_entry *ent)

static inline __u16 t_error(const struct trace_entry *ent)
{
- return te_blk_io_trace(ent)->sector;
+ return te_blk_io_trace(ent)->error;
}

static __u64 get_pdu_int(const struct trace_entry *ent)

2009-03-24 12:14:22

by Li Zefan

[permalink] [raw]
Subject: [tip:tracing/blktrace] blktrace: print human-readable act_mask

Commit-ID: 093419971e03362a00f499960557c119982ea09f
Gitweb: http://git.kernel.org/tip/093419971e03362a00f499960557c119982ea09f
Author: Li Zefan <[email protected]>
AuthorDate: Tue, 24 Mar 2009 17:43:30 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 24 Mar 2009 13:09:00 +0100

blktrace: print human-readable act_mask

Impact: new feature, allow symbolic values in /debug/tracing/act_mask

Print stringified act_mask instead of hex value:

# cat act_mask
read,write,barrier,sync,queue,requeue,issue,complete,fs,pc,ahead,meta,
discard,drv_data
# echo "meta,write" > act_mask
# cat act_mask
write,meta

Also:
- make act_mask accept "ahead", "meta", "discard" and "drv_data"
- use strsep() instead of strchr() to parse user input
- return -EINVAL if a token is not found in the mask map
- fix a bug that 'value' is unsigned, so it can < 0
- propagate error value of blk_trace_mask2str() to userspace, but not
always return -ENXIO.

Signed-off-by: Li Zefan <[email protected]>
Acked-by: Jens Axboe <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
kernel/trace/blktrace.c | 103 +++++++++++++++++++++++++++++-----------------
1 files changed, 65 insertions(+), 38 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index f69f8bd..6fb274f 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1316,53 +1316,77 @@ struct attribute_group blk_trace_attr_group = {
.attrs = blk_trace_attrs,
};

-static int blk_str2act_mask(const char *str)
+static const struct {
+ int mask;
+ const char *str;
+} mask_maps[] = {
+ { BLK_TC_READ, "read" },
+ { BLK_TC_WRITE, "write" },
+ { BLK_TC_BARRIER, "barrier" },
+ { BLK_TC_SYNC, "sync" },
+ { BLK_TC_QUEUE, "queue" },
+ { BLK_TC_REQUEUE, "requeue" },
+ { BLK_TC_ISSUE, "issue" },
+ { BLK_TC_COMPLETE, "complete" },
+ { BLK_TC_FS, "fs" },
+ { BLK_TC_PC, "pc" },
+ { BLK_TC_AHEAD, "ahead" },
+ { BLK_TC_META, "meta" },
+ { BLK_TC_DISCARD, "discard" },
+ { BLK_TC_DRV_DATA, "drv_data" },
+};
+
+static int blk_trace_str2mask(const char *str)
{
+ int i;
int mask = 0;
- char *copy = kstrdup(str, GFP_KERNEL), *s;
+ char *s, *token;

- if (copy == NULL)
+ s = kstrdup(str, GFP_KERNEL);
+ if (s == NULL)
return -ENOMEM;
-
- s = strstrip(copy);
+ s = strstrip(s);

while (1) {
- char *sep = strchr(s, ',');
-
- if (sep != NULL)
- *sep = '\0';
-
- if (strcasecmp(s, "barrier") == 0)
- mask |= BLK_TC_BARRIER;
- else if (strcasecmp(s, "complete") == 0)
- mask |= BLK_TC_COMPLETE;
- else if (strcasecmp(s, "fs") == 0)
- mask |= BLK_TC_FS;
- else if (strcasecmp(s, "issue") == 0)
- mask |= BLK_TC_ISSUE;
- else if (strcasecmp(s, "pc") == 0)
- mask |= BLK_TC_PC;
- else if (strcasecmp(s, "queue") == 0)
- mask |= BLK_TC_QUEUE;
- else if (strcasecmp(s, "read") == 0)
- mask |= BLK_TC_READ;
- else if (strcasecmp(s, "requeue") == 0)
- mask |= BLK_TC_REQUEUE;
- else if (strcasecmp(s, "sync") == 0)
- mask |= BLK_TC_SYNC;
- else if (strcasecmp(s, "write") == 0)
- mask |= BLK_TC_WRITE;
-
- if (sep == NULL)
+ token = strsep(&s, ",");
+ if (token == NULL)
break;

- s = sep + 1;
+ if (*token == '\0')
+ continue;
+
+ for (i = 0; i < ARRAY_SIZE(mask_maps); i++) {
+ if (strcasecmp(token, mask_maps[i].str) == 0) {
+ mask |= mask_maps[i].mask;
+ break;
+ }
+ }
+ if (i == ARRAY_SIZE(mask_maps)) {
+ mask = -EINVAL;
+ break;
+ }
}
- kfree(copy);
+ kfree(s);

return mask;
}

+static ssize_t blk_trace_mask2str(char *buf, int mask)
+{
+ int i;
+ char *p = buf;
+
+ for (i = 0; i < ARRAY_SIZE(mask_maps); i++) {
+ if (mask & mask_maps[i].mask) {
+ p += sprintf(p, "%s%s",
+ (p == buf) ? "" : ",", mask_maps[i].str);
+ }
+ }
+ *p++ = '\n';
+
+ return p - buf;
+}
+
static struct request_queue *blk_trace_get_queue(struct block_device *bdev)
{
if (bdev->bd_disk == NULL)
@@ -1399,7 +1423,7 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
if (q->blk_trace == NULL)
ret = sprintf(buf, "disabled\n");
else if (attr == &dev_attr_act_mask)
- ret = sprintf(buf, "%#x\n", q->blk_trace->act_mask);
+ ret = blk_trace_mask2str(buf, q->blk_trace->act_mask);
else if (attr == &dev_attr_pid)
ret = sprintf(buf, "%u\n", q->blk_trace->pid);
else if (attr == &dev_attr_start_lba)
@@ -1424,7 +1448,7 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
struct request_queue *q;
struct hd_struct *p;
u64 value;
- ssize_t ret = -ENXIO;
+ ssize_t ret = -EINVAL;

if (count == 0)
goto out;
@@ -1432,13 +1456,16 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
if (attr == &dev_attr_act_mask) {
if (sscanf(buf, "%llx", &value) != 1) {
/* Assume it is a list of trace category names */
- value = blk_str2act_mask(buf);
- if (value < 0)
+ ret = blk_trace_str2mask(buf);
+ if (ret < 0)
goto out;
+ value = ret;
}
} else if (sscanf(buf, "%llu", &value) != 1)
goto out;

+ ret = -ENXIO;
+
lock_kernel();
p = dev_to_part(dev);
bdev = bdget(part_devt(p));

2009-03-24 12:13:59

by Li Zefan

[permalink] [raw]
Subject: [tip:tracing/blktrace] blktrace: fix wrong calculation of RWBS

Commit-ID: 65796348e09880e12b97267d39b8857c758440a6
Gitweb: http://git.kernel.org/tip/65796348e09880e12b97267d39b8857c758440a6
Author: Li Zefan <[email protected]>
AuthorDate: Tue, 24 Mar 2009 16:05:06 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 24 Mar 2009 13:08:59 +0100

blktrace: fix wrong calculation of RWBS

Impact: fix the output of IO type category characters

Trace categories are the upper 16 bits, not the lower 16 bits.

Signed-off-by: Li Zefan <[email protected]>
Acked-by: Jens Axboe <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
kernel/trace/blktrace.c | 13 +++++++------
1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 1ffcbd4..9af4143 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -922,23 +922,24 @@ static void blk_unregister_tracepoints(void)
static void fill_rwbs(char *rwbs, const struct blk_io_trace *t)
{
int i = 0;
+ int tc = t->action >> BLK_TC_SHIFT;

- if (t->action & BLK_TC_DISCARD)
+ if (tc & BLK_TC_DISCARD)
rwbs[i++] = 'D';
- else if (t->action & BLK_TC_WRITE)
+ else if (tc & BLK_TC_WRITE)
rwbs[i++] = 'W';
else if (t->bytes)
rwbs[i++] = 'R';
else
rwbs[i++] = 'N';

- if (t->action & BLK_TC_AHEAD)
+ if (tc & BLK_TC_AHEAD)
rwbs[i++] = 'A';
- if (t->action & BLK_TC_BARRIER)
+ if (tc & BLK_TC_BARRIER)
rwbs[i++] = 'B';
- if (t->action & BLK_TC_SYNC)
+ if (tc & BLK_TC_SYNC)
rwbs[i++] = 'S';
- if (t->action & BLK_TC_META)
+ if (tc & BLK_TC_META)
rwbs[i++] = 'M';

rwbs[i] = '\0';

2009-03-24 12:49:34

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 1/5] blktrace: mark ddir_act and what2act const

Em Tue, Mar 24, 2009 at 04:04:37PM +0800, Li Zefan escreveu:
> ddir_act and what2act are always stay immutable.
>
> Signed-off-by: Li Zefan <[email protected]>
> ---
> kernel/trace/blktrace.c | 8 ++++----
> 1 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 108f4f7..1ffcbd4 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -147,8 +147,8 @@ static int act_log_check(struct blk_trace *bt, u32 what, sector_t sector,
> /*
> * Data direction bit lookup
> */
> -static u32 ddir_act[2] __read_mostly = { BLK_TC_ACT(BLK_TC_READ),
> - BLK_TC_ACT(BLK_TC_WRITE) };
> +static const u32 ddir_act[2] = { BLK_TC_ACT(BLK_TC_READ),
> + BLK_TC_ACT(BLK_TC_WRITE) };
>
> /* The ilog2() calls fall out because they're constant */
> #define MASK_TC_BIT(rw, __name) ((rw & (1 << BIO_RW_ ## __name)) << \
> @@ -1116,10 +1116,10 @@ static void blk_tracer_reset(struct trace_array *tr)
> blk_tracer_stop(tr);
> }
>
> -static struct {
> +static const struct {
> const char *act[2];
> int (*print)(struct trace_seq *s, const struct trace_entry *ent);
> -} what2act[] __read_mostly = {
> +} what2act[] = {
> [__BLK_TA_QUEUE] = {{ "Q", "queue" }, blk_log_generic },
> [__BLK_TA_BACKMERGE] = {{ "M", "backmerge" }, blk_log_generic },
> [__BLK_TA_FRONTMERGE] = {{ "F", "frontmerge" }, blk_log_generic },

Acked-by: Arnaldo Carvalho de Melo <[email protected]>

2009-03-24 12:51:46

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 4/5] blktrace: fix t_error()

Em Tue, Mar 24, 2009 at 04:05:51PM +0800, Li Zefan escreveu:
> t_error() should return t->error but not t->sector.
>
> Signed-off-by: Li Zefan <[email protected]>
> ---
> kernel/trace/blktrace.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 0e91caa..f33c176 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -968,7 +968,7 @@ static inline unsigned long long t_sector(const struct trace_entry *ent)
>
> static inline __u16 t_error(const struct trace_entry *ent)
> {
> - return te_blk_io_trace(ent)->sector;
> + return te_blk_io_trace(ent)->error;
> }
>
> static __u64 get_pdu_int(const struct trace_entry *ent)

Doh, thanks a lot, c'n'paste at its "best"!

Acked-by: Arnaldo Carvalho de Melo <[email protected]>

2009-03-24 15:48:45

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask


On Tue, 24 Mar 2009, Li Zefan wrote:
> +static ssize_t blk_trace_mask2str(char *buf, int mask)
> +{
> + int i;
> + char *p = buf;
> +
> + for (i = 0; i < ARRAY_SIZE(mask_maps); i++) {
> + if (mask & mask_maps[i].mask) {
> + p += sprintf(p, "%s%s",
> + (p == buf) ? "" : ",", mask_maps[i].str);
> + }
> + }
> + *p++ = '\n';
> +
> + return (p - buf);

The above is still simple, but if it starts to ge complex, you might want
to convert it to the trace_seq functions. That can make moving to a buffer
easier.

#include "trace_output.h"

static bool blk_trace_mask2str(struct trace_seq *s, int mask)
{
bool comma = false;
int i;

for (i = 0; i < ARRAY_SIZE(mask_maps); i++) {
if (mask & mask_maps[i].mask) {
trace_print_seq(s, "%s%s",
comma ? "," : "",
mask_maps[i].str);
comma = 1;
}
}
}


struct trace_seq *s;

s = kmalloc(sizeof(*s), GFP_KERNEL);
if (!s)
do error;


blk_trace_str2mask(s, mask);

This would put the data into s->buffer and have a length too.

Just letting you know about this. It may come in handy. Most of the code
in kernel/trace*.c uses this.

-- Steve


> +}
> +
> static struct request_queue *blk_trace_get_queue(struct block_device *bdev)
> {
> if (bdev->bd_disk == NULL)
> @@ -1399,7 +1423,7 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
> if (q->blk_trace == NULL)
> ret = sprintf(buf, "disabled\n");
> else if (attr == &dev_attr_act_mask)
> - ret = sprintf(buf, "%#x\n", q->blk_trace->act_mask);
> + ret = blk_trace_mask2str(buf, q->blk_trace->act_mask);
> else if (attr == &dev_attr_pid)
> ret = sprintf(buf, "%u\n", q->blk_trace->pid);
> else if (attr == &dev_attr_start_lba)
> @@ -1432,9 +1456,11 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
> if (attr == &dev_attr_act_mask) {
> if (sscanf(buf, "%llx", &value) != 1) {
> /* Assume it is a list of trace category names */
> - value = blk_str2act_mask(buf);
> - if (value < 0)
> + value = blk_trace_str2mask(buf);
> + if (value < 0) {
> + ret = value;
> goto out;
> + }
> }
> } else if (sscanf(buf, "%llu", &value) != 1)
> goto out;
> --
> 1.5.4.rc3
>
>

2009-03-24 15:49:58

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask



On Tue, 24 Mar 2009, Li Zefan wrote:
>
> +static ssize_t blk_trace_mask2str(char *buf, int mask)
> +{
> + int i;
> + char *p = buf;
> +
> + for (i = 0; i < ARRAY_SIZE(mask_maps); i++) {
> + if (mask & mask_maps[i].mask) {
> + p += sprintf(p, "%s%s",
> + (p == buf) ? "" : ",", mask_maps[i].str);
> + }
> + }
> + *p++ = '\n';
> +
> + return (p - buf);
> +}
> +
> static struct request_queue *blk_trace_get_queue(struct block_device *bdev)
> {
> if (bdev->bd_disk == NULL)
> @@ -1399,7 +1423,7 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
> if (q->blk_trace == NULL)
> ret = sprintf(buf, "disabled\n");
> else if (attr == &dev_attr_act_mask)
> - ret = sprintf(buf, "%#x\n", q->blk_trace->act_mask);
> + ret = blk_trace_mask2str(buf, q->blk_trace->act_mask);
> else if (attr == &dev_attr_pid)
> ret = sprintf(buf, "%u\n", q->blk_trace->pid);
> else if (attr == &dev_attr_start_lba)
> @@ -1432,9 +1456,11 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
> if (attr == &dev_attr_act_mask) {
> if (sscanf(buf, "%llx", &value) != 1) {
> /* Assume it is a list of trace category names */
> - value = blk_str2act_mask(buf);
> - if (value < 0)
> + value = blk_trace_str2mask(buf);

Hmm, does this compile? Doesn't blk_trace_str2mask need two parameters?

-- Steve

> + if (value < 0) {
> + ret = value;
> goto out;
> + }
> }
> } else if (sscanf(buf, "%llu", &value) != 1)
> goto out;
> --
> 1.5.4.rc3
>
>

2009-03-24 15:56:40

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask


* Steven Rostedt <[email protected]> wrote:

>
>
> On Tue, 24 Mar 2009, Li Zefan wrote:
> >
> > +static ssize_t blk_trace_mask2str(char *buf, int mask)
> > +{
> > + int i;
> > + char *p = buf;
> > +
> > + for (i = 0; i < ARRAY_SIZE(mask_maps); i++) {
> > + if (mask & mask_maps[i].mask) {
> > + p += sprintf(p, "%s%s",
> > + (p == buf) ? "" : ",", mask_maps[i].str);
> > + }
> > + }
> > + *p++ = '\n';
> > +
> > + return (p - buf);
> > +}
> > +
> > static struct request_queue *blk_trace_get_queue(struct block_device *bdev)
> > {
> > if (bdev->bd_disk == NULL)
> > @@ -1399,7 +1423,7 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
> > if (q->blk_trace == NULL)
> > ret = sprintf(buf, "disabled\n");
> > else if (attr == &dev_attr_act_mask)
> > - ret = sprintf(buf, "%#x\n", q->blk_trace->act_mask);
> > + ret = blk_trace_mask2str(buf, q->blk_trace->act_mask);
> > else if (attr == &dev_attr_pid)
> > ret = sprintf(buf, "%u\n", q->blk_trace->pid);
> > else if (attr == &dev_attr_start_lba)
> > @@ -1432,9 +1456,11 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
> > if (attr == &dev_attr_act_mask) {
> > if (sscanf(buf, "%llx", &value) != 1) {
> > /* Assume it is a list of trace category names */
> > - value = blk_str2act_mask(buf);
> > - if (value < 0)
> > + value = blk_trace_str2mask(buf);
>
> Hmm, does this compile? Doesn't blk_trace_str2mask need two parameters?

No, it needs one parameter.

Ingo

2009-03-24 16:04:57

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask




On Tue, 24 Mar 2009, Ingo Molnar wrote:

>
> * Steven Rostedt <[email protected]> wrote:
>
> >
> >
> > On Tue, 24 Mar 2009, Li Zefan wrote:
> > >
> > > +static ssize_t blk_trace_mask2str(char *buf, int mask)

> > > else if (attr == &dev_attr_act_mask)
> > > - ret = sprintf(buf, "%#x\n", q->blk_trace->act_mask);
> > > + ret = blk_trace_mask2str(buf, q->blk_trace->act_mask);


> > > - value = blk_str2act_mask(buf);
> > > - if (value < 0)
> > > + value = blk_trace_str2mask(buf);
> >
> > Hmm, does this compile? Doesn't blk_trace_str2mask need two parameters?
>
> No, it needs one parameter.

I'm really confused now. The prototype for blk_trace_str2mask has
two parameters. It is used with two parameters above, but then it is used
here with only one parameter. The blk_str2act_mask may have one, and that
is what is replaced.

What am I missing?

-- Steve

2009-03-24 16:06:26

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask


On Tue, 24 Mar 2009, Steven Rostedt wrote:

>
>
>
> On Tue, 24 Mar 2009, Ingo Molnar wrote:
>
> >
> > * Steven Rostedt <[email protected]> wrote:
> >
> > >
> > >
> > > On Tue, 24 Mar 2009, Li Zefan wrote:
> > > >
> > > > +static ssize_t blk_trace_mask2str(char *buf, int mask)
>
> > > > else if (attr == &dev_attr_act_mask)
> > > > - ret = sprintf(buf, "%#x\n", q->blk_trace->act_mask);
> > > > + ret = blk_trace_mask2str(buf, q->blk_trace->act_mask);
>
>
> > > > - value = blk_str2act_mask(buf);
> > > > - if (value < 0)
> > > > + value = blk_trace_str2mask(buf);
> > >
> > > Hmm, does this compile? Doesn't blk_trace_str2mask need two parameters?
> >
> > No, it needs one parameter.
>
> I'm really confused now. The prototype for blk_trace_str2mask has
> two parameters. It is used with two parameters above, but then it is used
> here with only one parameter. The blk_str2act_mask may have one, and that
> is what is replaced.
>
> What am I missing?

I'm dyslexic :-p

I see now...

The proto-type is blk_trace_mask2str

but this is blk_trace_str2mask

Nevermind,

-- Steve

2009-03-25 01:21:41

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 3/5] blktrace: fix off-by-one bug

Li Zefan wrote:
> 'what' is used as the index of array what2act, so it can't >= the array size.
>

I think this patch is needed no matter we support output of "abort"
event or not?

> Signed-off-by: Li Zefan <[email protected]>
> ---
> kernel/trace/blktrace.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 9af4143..0e91caa 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -1149,7 +1149,7 @@ static enum print_line_t blk_trace_event_print(struct trace_iterator *iter,
> if (!trace_print_context(iter))
> return TRACE_TYPE_PARTIAL_LINE;
>
> - if (unlikely(what == 0 || what > ARRAY_SIZE(what2act)))
> + if (unlikely(what == 0 || what >= ARRAY_SIZE(what2act)))
> ret = trace_seq_printf(s, "Bad pc action %x\n", what);
> else {
> const bool long_act = !!(trace_flags & TRACE_ITER_VERBOSE);
> @@ -1196,7 +1196,7 @@ static enum print_line_t blk_tracer_print_line(struct trace_iterator *iter)
> t = (const struct blk_io_trace *)iter->ent;
> what = t->action & ((1 << BLK_TC_SHIFT) - 1);
>
> - if (unlikely(what == 0 || what > ARRAY_SIZE(what2act)))
> + if (unlikely(what == 0 || what >= ARRAY_SIZE(what2act)))
> ret = trace_seq_printf(&iter->seq, "Bad pc action %x\n", what);
> else {
> const bool long_act = !!(trace_flags & TRACE_ITER_VERBOSE);

2009-03-25 02:54:52

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 5/5] blktrace: print human-readable act_mask

>> I'm really confused now. The prototype for blk_trace_str2mask has
>> two parameters. It is used with two parameters above, but then it is used
>> here with only one parameter. The blk_str2act_mask may have one, and that
>> is what is replaced.
>>
>> What am I missing?
>
> I'm dyslexic :-p
>
> I see now...
>
> The proto-type is blk_trace_mask2str
>
> but this is blk_trace_str2mask
>
> Nevermind,
>

Never mind, your review is always appreciated. :)

2009-03-25 13:16:21

by Li Zefan

[permalink] [raw]
Subject: [tip:tracing/blktrace] blktrace: fix off-by-one bug

Commit-ID: 89b0d8901c251548ee4d835da184dd08af67a3a7
Gitweb: http://git.kernel.org/tip/89b0d8901c251548ee4d835da184dd08af67a3a7
Author: Li Zefan <[email protected]>
AuthorDate: Tue, 24 Mar 2009 16:05:27 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 25 Mar 2009 14:13:11 +0100

blktrace: fix off-by-one bug

'what' is used as the index of array what2act, so it can't >= the array size.

Signed-off-by: Li Zefan <[email protected]>
Acked-by: Jens Axboe <[email protected]>
Acked-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Steven Rostedt <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
kernel/trace/blktrace.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index a7f7ff5..d43cdac 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1152,7 +1152,7 @@ static enum print_line_t blk_trace_event_print(struct trace_iterator *iter,
if (!trace_print_context(iter))
return TRACE_TYPE_PARTIAL_LINE;

- if (unlikely(what == 0 || what > ARRAY_SIZE(what2act)))
+ if (unlikely(what == 0 || what >= ARRAY_SIZE(what2act)))
ret = trace_seq_printf(s, "Bad pc action %x\n", what);
else {
const bool long_act = !!(trace_flags & TRACE_ITER_VERBOSE);
@@ -1199,7 +1199,7 @@ static enum print_line_t blk_tracer_print_line(struct trace_iterator *iter)
t = (const struct blk_io_trace *)iter->ent;
what = t->action & ((1 << BLK_TC_SHIFT) - 1);

- if (unlikely(what == 0 || what > ARRAY_SIZE(what2act)))
+ if (unlikely(what == 0 || what >= ARRAY_SIZE(what2act)))
ret = trace_seq_printf(&iter->seq, "Bad pc action %x\n", what);
else {
const bool long_act = !!(trace_flags & TRACE_ITER_VERBOSE);