2007-06-18 04:58:39

by David Wilder

Subject: [PATCH] relay-file-read-start-pos-fix.patch


--
David Wilder
IBM Linux Technology Center
Beaverton, Oregon, USA
[email protected]
(503)578-3789


Attachments:
relay-file-read-start-pos-fix.patch (907.00 B)

Subject: Re: [PATCH] relay-file-read-start-pos-fix.patch

Hi David and Tom,

David Wilder wrote:
> This patch fixes a bug in the relay read interface causing the number
> of consumed bytes to be set incorrectly.

Thank you. Your patch fixes one of my concerns.
However, I found another bug.
When I use relayfs in "overwrite" mode, read() still sets an incorrect
number of consumed bytes.
I tried to fix that. Please review it.

Signed-off-by: Masami Hiramatsu <[email protected]>

---
 kernel/relay.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Index: linux-2.6.22-rc4-mm2/kernel/relay.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/kernel/relay.c	2007-06-13 20:22:02.000000000 +0900
+++ linux-2.6.22-rc4-mm2/kernel/relay.c	2007-06-18 23:00:54.000000000 +0900
@@ -812,7 +812,10 @@
 	}
 
 	buf->bytes_consumed += bytes_consumed;
-	read_subbuf = read_pos / buf->chan->subbuf_size;
+	if (!read_pos)
+		read_subbuf = buf->subbufs_consumed;
+	else
+		read_subbuf = read_pos / buf->chan->subbuf_size;
 	if (buf->bytes_consumed + buf->padding[read_subbuf] == subbuf_size) {
 		if ((read_subbuf == buf->subbufs_produced % n_subbufs) &&
 		    (buf->offset == subbuf_size))
@@ -841,8 +844,9 @@
 	}
 
 	if (unlikely(produced - consumed >= n_subbufs)) {
-		consumed = (produced / n_subbufs) * n_subbufs;
+		consumed = produced - n_subbufs + 1;
 		buf->subbufs_consumed = consumed;
+		buf->bytes_consumed = 0;
 	}
 
 	produced = (produced % n_subbufs) * subbuf_size + buf->offset;



2007-06-19 05:36:46

by Tom Zanussi

Subject: Re: [PATCH] relay-file-read-start-pos-fix.patch

On Tue, 2007-06-19 at 12:43 +0900, Masami Hiramatsu wrote:
> Hi David and Tom,
>
> David Wilder wrote:
> > This patch fixes a bug in the relay read interface causing the number
> > of consumed bytes to be set incorrectly.
>
> Thank you. Your patch fixes one of my concerns.
> However there is another bug I found.
> When I use relayfs with "overwrite" mode, read() still set incorrect
> number of consumed bytes.
> I tried to fix that. Please review it.

Hi,

Could you send more info on how to reproduce the problem you're seeing?
And does this patch fix it?

Thanks,

Tom


>
> Signed-off-by: Masami Hiramatsu <[email protected]>
>
> ---
> kernel/relay.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> Index: linux-2.6.22-rc4-mm2/kernel/relay.c
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/kernel/relay.c 2007-06-13 20:22:02.000000000 +0900
> +++ linux-2.6.22-rc4-mm2/kernel/relay.c 2007-06-18 23:00:54.000000000 +0900
> @@ -812,7 +812,10 @@
> }
>
> buf->bytes_consumed += bytes_consumed;
> - read_subbuf = read_pos / buf->chan->subbuf_size;
> + if (!read_pos)
> + read_subbuf = buf->subbufs_consumed;
> + else
> + read_subbuf = read_pos / buf->chan->subbuf_size;
> if (buf->bytes_consumed + buf->padding[read_subbuf] == subbuf_size) {
> if ((read_subbuf == buf->subbufs_produced % n_subbufs) &&
> (buf->offset == subbuf_size))
> @@ -841,8 +844,9 @@
> }
>
> if (unlikely(produced - consumed >= n_subbufs)) {
> - consumed = (produced / n_subbufs) * n_subbufs;
> + consumed = produced - n_subbufs + 1;
> buf->subbufs_consumed = consumed;
> + buf->bytes_consumed = 0;
> }
>
> produced = (produced % n_subbufs) * subbuf_size + buf->offset;
>
>
>
>


2007-06-19 16:27:25

by Tom Zanussi

Subject: Re: [PATCH] relay-file-read-start-pos-fix.patch

On Tue, 2007-06-19 at 12:43 +0900, Masami Hiramatsu wrote:
> Hi David and Tom,
>
> David Wilder wrote:
> > This patch fixes a bug in the relay read interface causing the number
> > of consumed bytes to be set incorrectly.
>
> Thank you. Your patch fixes one of my concerns.
> However there is another bug I found.
> When I use relayfs with "overwrite" mode, read() still set incorrect
> number of consumed bytes.
> I tried to fix that. Please review it.

Hi,

I haven't had a chance to test it myself yet, but it looks ok to me,
except for one problem noted below...

Thanks for fixing it.

>
> Signed-off-by: Masami Hiramatsu <[email protected]>
>
> ---
> kernel/relay.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> Index: linux-2.6.22-rc4-mm2/kernel/relay.c
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/kernel/relay.c 2007-06-13 20:22:02.000000000 +0900
> +++ linux-2.6.22-rc4-mm2/kernel/relay.c 2007-06-18 23:00:54.000000000 +0900
> @@ -812,7 +812,10 @@
> }
>
> buf->bytes_consumed += bytes_consumed;
> - read_subbuf = read_pos / buf->chan->subbuf_size;
> + if (!read_pos)
> + read_subbuf = buf->subbufs_consumed;

I think this should be instead:

+ read_subbuf = buf->subbufs_consumed % n_subbufs;

Tom


> + else
> + read_subbuf = read_pos / buf->chan->subbuf_size;
> if (buf->bytes_consumed + buf->padding[read_subbuf] == subbuf_size) {
> if ((read_subbuf == buf->subbufs_produced % n_subbufs) &&
> (buf->offset == subbuf_size))
> @@ -841,8 +844,9 @@
> }
>
> if (unlikely(produced - consumed >= n_subbufs)) {
> - consumed = (produced / n_subbufs) * n_subbufs;
> + consumed = produced - n_subbufs + 1;
> buf->subbufs_consumed = consumed;
> + buf->bytes_consumed = 0;
> }
>
> produced = (produced % n_subbufs) * subbuf_size + buf->offset;
>
>
>


Subject: Re: [PATCH] relay-file-read-start-pos-fix.patch

Tom Zanussi wrote:
> Hi,
>
> I haven't had a chance to test it myself yet, but it looks ok to me,
> except for one problem noted below...

Hi,

Thank you so much!
I'm preparing instructions to reproduce it. I'll send them as soon as possible.

> Thanks for fixing it.
>
>> Signed-off-by: Masami Hiramatsu <[email protected]>
>>
>> ---
>> kernel/relay.c | 8 ++++++--
>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> Index: linux-2.6.22-rc4-mm2/kernel/relay.c
>> ===================================================================
>> --- linux-2.6.22-rc4-mm2.orig/kernel/relay.c 2007-06-13 20:22:02.000000000 +0900
>> +++ linux-2.6.22-rc4-mm2/kernel/relay.c 2007-06-18 23:00:54.000000000 +0900
>> @@ -812,7 +812,10 @@
>> }
>>
>> buf->bytes_consumed += bytes_consumed;
>> - read_subbuf = read_pos / buf->chan->subbuf_size;
>> + if (!read_pos)
>> + read_subbuf = buf->subbufs_consumed;
>
> I think this should be instead:
>
> + read_subbuf = buf->subbufs_consumed % n_subbufs;

Yes, you are right.
Thank you again.

--
Masami HIRAMATSU
Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: [email protected]

Subject: Re: [PATCH] relay-file-read-start-pos-fix.patch

Hi Tom,

Tom Zanussi wrote:
> Could you send more info on how to reproduce the problem you're seeing?
> And does this patch fix it?

Sure, I'll explain how to reproduce it.

Since the current SystemTap does not support "overwrite" mode,
you need to apply a patch before trying to reproduce it.
I have already posted the patch to bugzilla. You can get it from the link below.
http://sourceware.org/bugzilla/attachment.cgi?id=1896&action=view

Here is an example script (fillup.stp).
----
global counter=0
probe timer.ms(1) {
counter++;
printf("%08d : %020d\n", counter, gettimeofday_ns());
}
----

First of all, run the script with the -O (overwrite mode) flag.
(To simplify my explanation, I also use the -m flag here.)
$ stap -O fillup.stp -m fillup
Soon after starting, press ^\ (Ctrl+\) to detach from it.

The script writes 32 bytes of dummy data every millisecond, so
it writes about 32 kB per second.
The default size of SystemTap's relay channel is 512 kB,
divided into 4 sub-buffers of 128 kB each.
Thus, the script fills the relay channel in about 16 seconds and then
wraps around, because it uses overwrite mode.

So, wait more than 16 seconds (for example, 18 seconds), then
read the relay channel and count the number of lines.
Then repeat.
$ while true; do sleep 18; \
cat /sys/kernel/debug/systemtap/fillup/trace0 | wc -l; done

Ideally, it will show a number between 12288 (= 3 * 128k / 32) and
16384 (= 4 * 128k / 32).

However, without my patch, it shows:
4793
5721
9818
9817
9819
13912
0
0
780
1625
5723
5721

And with my patch:
15742
13273
14901
12430
14056
15682
13215
14840
12370
13996
15624
13154


So, I think my patch fixes the problem.

Thanks,

P.S.
I have attached my updated patch (relay-file-read-overwrite-mode-fix.patch),
which fixes the problem pointed out in the previous mail.

Signed-off-by: Masami Hiramatsu <[email protected]>

---
 kernel/relay.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Index: linux-2.6.22-rc4-mm2/kernel/relay.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/kernel/relay.c	2007-06-13 20:22:02.000000000 +0900
+++ linux-2.6.22-rc4-mm2/kernel/relay.c	2007-06-20 10:53:06.000000000 +0900
@@ -812,7 +812,10 @@
 	}
 
 	buf->bytes_consumed += bytes_consumed;
-	read_subbuf = read_pos / buf->chan->subbuf_size;
+	if (!read_pos)
+		read_subbuf = buf->subbufs_consumed % n_subbufs;
+	else
+		read_subbuf = read_pos / buf->chan->subbuf_size;
 	if (buf->bytes_consumed + buf->padding[read_subbuf] == subbuf_size) {
 		if ((read_subbuf == buf->subbufs_produced % n_subbufs) &&
 		    (buf->offset == subbuf_size))
@@ -841,8 +844,9 @@
 	}
 
 	if (unlikely(produced - consumed >= n_subbufs)) {
-		consumed = (produced / n_subbufs) * n_subbufs;
+		consumed = produced - n_subbufs + 1;
 		buf->subbufs_consumed = consumed;
+		buf->bytes_consumed = 0;
 	}
 
 	produced = (produced % n_subbufs) * subbuf_size + buf->offset;




2007-06-20 14:52:40

by Tom Zanussi

Subject: Re: [PATCH] relay-file-read-start-pos-fix.patch

On Wed, 2007-06-20 at 17:31 +0900, Masami Hiramatsu wrote:

[...]

> P.S.
> I attached my patch (relay-file-read-overwrite-mode-fix.patch)
> which fixed the problem pointed in previous mail.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
>

Hi,

Thanks for sending the test case. I wrote my own program that tests for
the same thing, and it also works fine. To make sure it doesn't break
no-overwrite mode, I also ran several tests with blktrace and that looks
good too.

Thanks very much for analyzing the problem and providing the patch!

Just to summarize - the problem this fixes is that in overwrite mode,
the current read code doesn't pick up all the data it could if the whole
buffer is filled - it will leave behind sub-buffers at the beginning.
With this patch, the data in those sub-buffers is read as well.

Acked-by: Tom Zanussi <[email protected]>

> ---
> kernel/relay.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> Index: linux-2.6.22-rc4-mm2/kernel/relay.c
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/kernel/relay.c 2007-06-13 20:22:02.000000000 +0900
> +++ linux-2.6.22-rc4-mm2/kernel/relay.c 2007-06-20 10:53:06.000000000 +0900
> @@ -812,7 +812,10 @@
> }
>
> buf->bytes_consumed += bytes_consumed;
> - read_subbuf = read_pos / buf->chan->subbuf_size;
> + if (!read_pos)
> + read_subbuf = buf->subbufs_consumed % n_subbufs;
> + else
> + read_subbuf = read_pos / buf->chan->subbuf_size;
> if (buf->bytes_consumed + buf->padding[read_subbuf] == subbuf_size) {
> if ((read_subbuf == buf->subbufs_produced % n_subbufs) &&
> (buf->offset == subbuf_size))
> @@ -841,8 +844,9 @@
> }
>
> if (unlikely(produced - consumed >= n_subbufs)) {
> - consumed = (produced / n_subbufs) * n_subbufs;
> + consumed = produced - n_subbufs + 1;
> buf->subbufs_consumed = consumed;
> + buf->bytes_consumed = 0;
> }
>
> produced = (produced % n_subbufs) * subbuf_size + buf->offset;
>
>
>
>


2007-06-21 18:06:59

by David Wilder

Subject: Re: [PATCH] relay-file-read-start-pos-fix.patch

Ack
Works for me. Thanks.

Note:
Both Masami's patch and the relay-file-read-start-pos-fix.patch I posted
earlier are required.

Masami Hiramatsu wrote:

>Hi Tom,
>
>Tom Zanussi wrote:
>
>
>>Could you send more info on how to reproduce the problem you're seeing?
>>And does this patch fix it?
>>
>>
>
>Sure, I'll explain how to reproduce it.
>
>Since current SystemTap is not supporting "overwrite" mode,
>you need to apply a patch before trying to reproduce it.
>I already posted the patch to bugzilla. You can get it from below.
>http://sourceware.org/bugzilla/attachment.cgi?id=1896&action=view
>
>Here is an example script (fillup.stp).
>----
>global counter=0
>probe timer.ms(1) {
> counter++;
> printf("%08d : %020d\n", counter, gettimeofday_ns());
>}
>----
>
>First of all, run the script with -O (overwrite mode) flag.
>(For simplify my explanation, I also use -m flag here.)
>$ stap -O fillup.stp -m fillup
>Soon after starting, press ^\(Ctrl+\) to detach from it.
>
>The script writes 32 bytes dummy data per 1 milli-second, so
>it writes about 32k bytes per 1 second.
>And the default size of relay channel of systemtap is 512kB
>which contains 4 subbufs (each size of subbufs is 128kB).
>Thus, it fills the relay channel at about 16 seconds and
>wraparounds because it uses overwrite mode.
>
>So, wait more than 16 seconds (for example, 18 sec),
>read the relay channel and count the line number.
>And repeat it.
>$ while true; do sleep 18; \
> cat /sys/kernel/debug/systemtap/fillup/trace0 | wc -l; done
>
>Ideally, it will show the number from 12288(=3*128k/32) to
>16384(=4*128k/32).
>
>However, without my patch, it shows;
>4793
>5721
>9818
>9817
>9819
>13912
>0
>0
>780
>1625
>5723
>5721
>
>And, with my patch;
>15742
>13273
>14901
>12430
>14056
>15682
>13215
>14840
>12370
>13996
>15624
>13154
>
>
>So, I think my patch (which )fixes the problem.
>
>Thanks,
>
>P.S.
>I attached my patch (relay-file-read-overwrite-mode-fix.patch)
>which fixed the problem pointed in previous mail.
>
>Signed-off-by: Masami Hiramatsu <[email protected]>
>
>---
> kernel/relay.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
>Index: linux-2.6.22-rc4-mm2/kernel/relay.c
>===================================================================
>--- linux-2.6.22-rc4-mm2.orig/kernel/relay.c 2007-06-13 20:22:02.000000000 +0900
>+++ linux-2.6.22-rc4-mm2/kernel/relay.c 2007-06-20 10:53:06.000000000 +0900
>@@ -812,7 +812,10 @@
> }
>
> buf->bytes_consumed += bytes_consumed;
>- read_subbuf = read_pos / buf->chan->subbuf_size;
>+ if (!read_pos)
>+ read_subbuf = buf->subbufs_consumed % n_subbufs;
>+ else
>+ read_subbuf = read_pos / buf->chan->subbuf_size;
> if (buf->bytes_consumed + buf->padding[read_subbuf] == subbuf_size) {
> if ((read_subbuf == buf->subbufs_produced % n_subbufs) &&
> (buf->offset == subbuf_size))
>@@ -841,8 +844,9 @@
> }
>
> if (unlikely(produced - consumed >= n_subbufs)) {
>- consumed = (produced / n_subbufs) * n_subbufs;
>+ consumed = produced - n_subbufs + 1;
> buf->subbufs_consumed = consumed;
>+ buf->bytes_consumed = 0;
> }
>
> produced = (produced % n_subbufs) * subbuf_size + buf->offset;
>
>
>
>
>
>
>


--
David Wilder
IBM Linux Technology Center
Beaverton, Oregon, USA
[email protected]
(503)578-3789