This fixes a scenario in which trace_pipe_raw will return events with invalid
timestamps when events are copied out one at a time (instead of swapping out
the reader page with a spare page). In this scenario, ring_buffer_read_page()
uses cpu_buffer->read_stamp to keep track of the time of the earliest event.
However, cpu_buffer->read_stamp was not always updated. The only function that
sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
only by rb_get_reader_page(). rb_reset_reader_page() was not called when there
is data immediately available on the page to read (read < rb_page_size()). This
is the bug.
Setting the read_stamp on the first read from a page fixes the bug.
Tested: On certain lightly-loaded machines, repetitive reads from
trace_pipe_raw without using splice() would sometimes result in invalid
timestamps. Poisoning read_stamp in rb_init_page() with a negative value makes
the problem more visible. After this fix, the invalid timstamps disappear.
Google-Bug-Id: 6410455
Signed-off-by: David Sharp <[email protected]>
---
kernel/trace/ring_buffer.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 1d0f6a8..ad0239b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3226,8 +3226,15 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
reader = cpu_buffer->reader_page;
/* If there's more to read, return this page */
- if (cpu_buffer->reader_page->read < rb_page_size(reader))
+ if (cpu_buffer->reader_page->read < rb_page_size(reader)) {
+ /*
+ * If this is the first read from this page,
+ * initialize the read timestamp.
+ */
+ if (cpu_buffer->reader_page->read == 0)
+ cpu_buffer->read_stamp = reader->page->time_stamp;
goto out;
+ }
/* Never should we have an index greater than the size */
if (RB_WARN_ON(cpu_buffer,
--
1.7.7.3
On Thu, Jun 7, 2012 at 4:27 PM, David Sharp <[email protected]> wrote:
> This fixes a scenario in which trace_pipe_raw will return events with invalid
> timestamps when events are copied out one at a time (instead of swapping out
> the reader page with a spare page). In this scenario, ring_buffer_read_page()
> uses cpu_buffer->read_stamp to keep track of the time of the earliest event.
> However, cpu_buffer->read_stamp was not always updated. The only function that
> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
> only by rb_get_reader_page(). rb_reset_reader_page() was not called when there
> is data immediately available on the page to read (read < rb_page_size()). This
> is the bug.
>
> Setting the read_stamp on the first read from a page fixes the bug.
>
> Tested: On certain lightly-loaded machines, repetitive reads from
> trace_pipe_raw without using splice() would sometimes result in invalid
> timestamps. Poisoning read_stamp in rb_init_page() with a negative value makes
> the problem more visible. After this fix, the invalid timstamps disappear.
err, that's poisoning the bpage->time_stamp, not read_stamp.
>
> Google-Bug-Id: 6410455
> Signed-off-by: David Sharp <[email protected]>
This fixes a scenario in which trace_pipe_raw will return events with invalid
timestamps when events are copied out one at a time (instead of swapping out
the reader page with a spare page). In this scenario, ring_buffer_read_page()
uses cpu_buffer->read_stamp to keep track of the time of the earliest event.
However, cpu_buffer->read_stamp was not always updated. The only function that
sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
only by rb_get_reader_page(). rb_reset_reader_page() was not called when there
is data immediately available on the page to read (read < rb_page_size()). This
is the bug.
Setting the read_stamp on the first read from a page fixes the bug.
Tested: On certain lightly-loaded machines, repetitive reads from
trace_pipe_raw without using splice() would sometimes result in invalid
timestamps. Poisoning bpage->timestamp in rb_init_page() with a negative value
makes the problem more visible. After this fix, the invalid timstamps
disappear.
Google-Bug-Id: 6410455
Signed-off-by: David Sharp <[email protected]>
---
Steve, did you see this? It fixes a data-accuracy bug in ftrace. Also taking
this reminder as an opportunity to correct that typo in the description.
kernel/trace/ring_buffer.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 1d0f6a8..ad0239b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3226,8 +3226,15 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
reader = cpu_buffer->reader_page;
/* If there's more to read, return this page */
- if (cpu_buffer->reader_page->read < rb_page_size(reader))
+ if (cpu_buffer->reader_page->read < rb_page_size(reader)) {
+ /*
+ * If this is the first read from this page,
+ * initialize the read timestamp.
+ */
+ if (cpu_buffer->reader_page->read == 0)
+ cpu_buffer->read_stamp = reader->page->time_stamp;
goto out;
+ }
/* Never should we have an index greater than the size */
if (RB_WARN_ON(cpu_buffer,
--
1.7.7.3
On Mon, 2012-06-18 at 16:02 -0700, David Sharp wrote:
> This fixes a scenario in which trace_pipe_raw will return events with invalid
> timestamps when events are copied out one at a time (instead of swapping out
> the reader page with a spare page). In this scenario, ring_buffer_read_page()
> uses cpu_buffer->read_stamp to keep track of the time of the earliest event.
> However, cpu_buffer->read_stamp was not always updated. The only function that
> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
> only by rb_get_reader_page(). rb_reset_reader_page() was not called when there
> is data immediately available on the page to read (read < rb_page_size()). This
> is the bug.
Hi David,
Thanks for the reminder, I'll be looking at it today.
Do you believe that this is an urgent fix and should be marked for
stable, or do you think it can wait till 3.6? If you think it should be
marked for stable, then it should be pushed for 3.5.
-- Steve
OK, I did look at this more.
On Mon, 2012-06-18 at 16:02 -0700, David Sharp wrote:
> This fixes a scenario in which trace_pipe_raw will return events with invalid
> timestamps when events are copied out one at a time (instead of swapping out
> the reader page with a spare page). In this scenario, ring_buffer_read_page()
> uses cpu_buffer->read_stamp to keep track of the time of the earliest event.
> However, cpu_buffer->read_stamp was not always updated.
> The only function that
> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
> only by rb_get_reader_page().
Correct, which is the only way to get the reader page no matter how you
read the buffer.
> rb_reset_reader_page() was not called when there
> is data immediately available on the page to read (read < rb_page_size()). This
> is the bug.
It is not called? but how did you get the reader_page without the swap
in the first place?
When the ring buffer is first allocated the reader page is set to a
zero'd page, making the "read" and "commit" both zero.
The first time rb_get_reader_page() is called, the compare of read <
rb_page_size() (which is the commit field) fails (0 < 0 is false).
A swap from the writable ring buffer takes place and the
cpu_buffer->read_stamp is updated.
Ah! I think this is where the bug you see happens. But your analysis is
flawed.
If the ring buffer is currently empty, we swap an empty page for an
empty page. But the writer ends up on the reader page preventing new
swaps from taking place.
commit_page == reader_page
If the writer starts writing, it will update the time stamp of the page.
If your next read happens now, and it's just a single read, then you are
correct that it will not update the read_stamp.
I'm wondering if it would be better to just not do the swap, and return
NULL when it is empty. This would also fix the problem, as the
read_stamp will only be set when you actually have data.
Or it may just be simpler to take your patch.
-- Steve
On Thu, 2012-06-21 at 11:46 -0400, Steven Rostedt wrote:
> I'm wondering if it would be better to just not do the swap, and return
> NULL when it is empty. This would also fix the problem, as the
> read_stamp will only be set when you actually have data.
Does something like this work for you. Note, this is totally untested!
-- Steve
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index ad0239b..5943044 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
if (cpu_buffer->commit_page == cpu_buffer->reader_page)
goto out;
+ /* Don't bother swapping if the ring buffer is empty */
+ if (rb_num_of_entries(cpu_buffer) == 0)
+ goto out;
+
/*
* Reset the reader page to size zero.
*/
On Thu, Jun 21, 2012 at 7:52 AM, Steven Rostedt <[email protected]> wrote:
> Do you believe that this is an urgent fix and should be marked for
> stable, or do you think it can wait till 3.6? If you think it should be
> marked for stable, then it should be pushed for 3.5.
I think it should be considered for 3.5, since it affects the accuracy
of the timestamps in very roughly 20% of traces read with read() on
trace_pipe_raw. otoh, it only affects the first page or so on lightly
loaded CPUs. Personally it doesn't make much difference to us what
release it gets into.
On Thu, Jun 21, 2012 at 8:46 AM, Steven Rostedt <[email protected]> wrote:
> OK, I did look at this more.
>> The only function that
>> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
>> only by rb_get_reader_page().
>
> Correct, which is the only way to get the reader page no matter how you
> read the buffer.
>
>> rb_reset_reader_page() was not called when there
>> is data immediately available on the page to read (read < rb_page_size()). This
>> is the bug.
>
> It is not called?
Correct. I was able to add WARN_ONs that showed that it reached "out:"
without reaching rb_reset_reader_page() first while still having a
poison value in cpu_buffer->read_stamp.
> but how did you get the reader_page without the swap
> in the first place?
>
> When the ring buffer is first allocated the reader page is set to a
> zero'd page, making the "read" and "commit" both zero.
>
> The first time rb_get_reader_page() is called, the compare of read <
> rb_page_size() (which is the commit field) fails (0 < 0 is false).
>
> A swap from the writable ring buffer takes place and the
> cpu_buffer->read_stamp is updated.
>
> Ah! I think this is where the bug you see happens. But your analysis is
> flawed.
I admit I didn't look into it that closely, so flaws are quite
possible (also, I'm human).
>
> If the ring buffer is currently empty, we swap an empty page for an
> empty page. But the writer ends up on the reader page preventing new
> swaps from taking place.
>
> commit_page == reader_page
>
> If the writer starts writing, it will update the time stamp of the page.
> If your next read happens now, and it's just a single read, then you are
> correct that it will not update the read_stamp.
>
> I'm wondering if it would be better to just not do the swap, and return
> NULL when it is empty. This would also fix the problem, as the
> read_stamp will only be set when you actually have data.
But we do have data... That's how we know we're getting invalid timestamps.
I don't quite understand what you're describing. Here's what I think
is happening though:
When the ring buffer is reset, commit_page == tail_page == head_page.
rb_get_reader_page() will pick up the head page. Then a few (less than
1 page worth) writes happen, on the tail_page which is currently (or
soon to be) also the reader_page. Now read == 0, but
rb_page_size(reader) is however many bytes have been written to the
page.
So we have valid data on the reader page, but read_stamp has not been set yet.
>
> Or it may just be simpler to take your patch.
Please? :)
On Thu, Jun 21, 2012 at 8:56 AM, Steven Rostedt <[email protected]> wrote:
> Does something like this work for you. Note, this is totally untested!
>
> -- Steve
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index ad0239b..5943044 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
> if (cpu_buffer->commit_page == cpu_buffer->reader_page)
> goto out;
>
> + /* Don't bother swapping if the ring buffer is empty */
This doesn't make sense, since the ring buffer isn't empty in this
scenario. (I didn't bother testing it.)
> + if (rb_num_of_entries(cpu_buffer) == 0)
Did you mean rb_page_entries(something?)
> + goto out;
> +
> /*
> * Reset the reader page to size zero.
> */
>
>
On Fri, 2012-06-22 at 13:50 -0700, David Sharp wrote:
> On Thu, Jun 21, 2012 at 7:52 AM, Steven Rostedt <[email protected]> wrote:
> > Do you believe that this is an urgent fix and should be marked for
> > stable, or do you think it can wait till 3.6? If you think it should be
> > marked for stable, then it should be pushed for 3.5.
>
> I think it should be considered for 3.5, since it affects the accuracy
> of the timestamps in very roughly 20% of traces read with read() on
> trace_pipe_raw. otoh, it only affects the first page or so on lightly
> loaded CPUs. Personally it doesn't make much difference to us what
> release it gets into.
If there's a simple fix then I can try to get this into 3.5.
>
> On Thu, Jun 21, 2012 at 8:46 AM, Steven Rostedt <[email protected]> wrote:
> > OK, I did look at this more.
>
> >> The only function that
> >> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
> >> only by rb_get_reader_page().
> >
> > Correct, which is the only way to get the reader page no matter how you
> > read the buffer.
> >
> >> rb_reset_reader_page() was not called when there
> >> is data immediately available on the page to read (read < rb_page_size()). This
> >> is the bug.
> >
> > It is not called?
>
> Correct. I was able to add WARN_ONs that showed that it reached "out:"
> without reaching rb_reset_reader_page() first while still having a
> poison value in cpu_buffer->read_stamp.
NO! You didn't understand what I wrote. The writer can not get onto the
reader page without a swap! The reader page is not part of the main ring
buffer.
>
> > but how did you get the reader_page without the swap
> > in the first place?
> >
> > When the ring buffer is first allocated the reader page is set to a
> > zero'd page, making the "read" and "commit" both zero.
> >
> > The first time rb_get_reader_page() is called, the compare of read <
> > rb_page_size() (which is the commit field) fails (0 < 0 is false).
> >
> > A swap from the writable ring buffer takes place and the
> > cpu_buffer->read_stamp is updated.
> >
> > Ah! I think this is where the bug you see happens. But your analysis is
> > flawed.
>
> I admit I didn't look into it that closely, so flaws are quite
> possible (also, I'm human).
I didn't mean to offend, as this is not something that can be easily
picked up.
>
> >
> > If the ring buffer is currently empty, we swap an empty page for an
> > empty page. But the writer ends up on the reader page preventing new
> > swaps from taking place.
> >
> > commit_page == reader_page
> >
> > If the writer starts writing, it will update the time stamp of the page.
> > If your next read happens now, and it's just a single read, then you are
> > correct that it will not update the read_stamp.
> >
> > I'm wondering if it would be better to just not do the swap, and return
> > NULL when it is empty. This would also fix the problem, as the
> > read_stamp will only be set when you actually have data.
>
> But we do have data... That's how we know we're getting invalid timestamps.
I bet this never happens if you add a little data and then do a read. I
only happens when you read from an empty ring buffer, then do a small
write, and then read again. That's where it will get you.
>
> I don't quite understand what you're describing. Here's what I think
> is happening though:
>
> When the ring buffer is reset, commit_page == tail_page == head_page.
And at this moment, the writer is off the reader page and will not be on
the reader page. The only way to get date at this point is if you do a
swap.
> rb_get_reader_page() will pick up the head page.
Right. But this requires a read to happen. If you never do a read, then
rb_get_reader_page will not pick up the head page. The problem is when
you read from an empty ring buffer.
> Then a few (less than
> 1 page worth) writes happen, on the tail_page which is currently (or
> soon to be) also the reader_page. Now read == 0, but
> rb_page_size(reader) is however many bytes have been written to the
> page.
Correct.
>
> So we have valid data on the reader page, but read_stamp has not been set yet.
Yep
>
> >
> > Or it may just be simpler to take your patch.
>
> Please? :)
Now I think you may understand my patch.
>
> On Thu, Jun 21, 2012 at 8:56 AM, Steven Rostedt <[email protected]> wrote:
> > Does something like this work for you. Note, this is totally untested!
> >
> > -- Steve
> >
> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> > index ad0239b..5943044 100644
> > --- a/kernel/trace/ring_buffer.c
> > +++ b/kernel/trace/ring_buffer.c
> > @@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
> > if (cpu_buffer->commit_page == cpu_buffer->reader_page)
> > goto out;
> >
> > + /* Don't bother swapping if the ring buffer is empty */
>
> This doesn't make sense, since the ring buffer isn't empty in this
> scenario. (I didn't bother testing it.)
Yes it is! If you never did a read on the empty buffer, the *only* way
to get what was written is to do the swap.
The problem you are seeing is that you did a read on an empty buffer
before writing. In this case the swap happens but all we have is an
empty page with the write pointer (commit and tail) on it. When a write
happens, and you do another read, *then* we do not do the swap and you
get the stale time stamp.
Thus, if you never let the first swap happen, you wont get into this
scenario that you read and do not update the read_stamp.
>
> > + if (rb_num_of_entries(cpu_buffer) == 0)
>
> Did you mean rb_page_entries(something?)
Nope, I meant the actual ring buffer. If the ring buffer is empty, do
not perform the swap. Because we only update the read stamp when we
perform a swap, and it takes a write to happen before the page's
timestamp is updated. If we do the swap before any writes happen, we get
an invalid timestamp from the page.
Now do you understand?
-- Steve
>
> > + goto out;
> > +
> > /*
> > * Reset the reader page to size zero.
> > */
> >
> >
On Fri, Jun 22, 2012 at 4:31 PM, Steven Rostedt <[email protected]> wrote:
> On Fri, 2012-06-22 at 13:50 -0700, David Sharp wrote:
>> On Thu, Jun 21, 2012 at 8:46 AM, Steven Rostedt <[email protected]> wrote:
>> > OK, I did look at this more.
>>
>> >> The only function that
>> >> sets read_stamp to a new valid value is rb_reset_reader_page(), which is called
>> >> only by rb_get_reader_page().
>> >
>> > Correct, which is the only way to get the reader page no matter how you
>> > read the buffer.
>> >
>> >> rb_reset_reader_page() was not called when there
>> >> is data immediately available on the page to read (read < rb_page_size()). This
>> >> is the bug.
>> >
>> > It is not called?
>>
>> Correct. I was able to add WARN_ONs that showed that it reached "out:"
>> without reaching rb_reset_reader_page() first while still having a
>> poison value in cpu_buffer->read_stamp.
>
> NO! You didn't understand what I wrote. The writer can not get onto the
> reader page without a swap! The reader page is not part of the main ring
> buffer.
I understand that. I just thought the writer was getting onto the
reader page during a swap in the first read, the same read in which we
were getting invalid timestamps. But that doesn't make sense.
>> > Ah! I think this is where the bug you see happens. But your analysis is
>> > flawed.
>>
>> I admit I didn't look into it that closely, so flaws are quite
>> possible (also, I'm human).
>
> I didn't mean to offend, as this is not something that can be easily
> picked up.
None taken. I only meant I wasn't as rigorous as I could have been.
>> > If the ring buffer is currently empty, we swap an empty page for an
>> > empty page. But the writer ends up on the reader page preventing new
>> > swaps from taking place.
>> >
>> > commit_page == reader_page
>> >
>> > If the writer starts writing, it will update the time stamp of the page.
>> > If your next read happens now, and it's just a single read, then you are
>> > correct that it will not update the read_stamp.
>> >
>> > I'm wondering if it would be better to just not do the swap, and return
>> > NULL when it is empty. This would also fix the problem, as the
>> > read_stamp will only be set when you actually have data.
>>
>> But we do have data... That's how we know we're getting invalid timestamps.
>
> I bet this never happens if you add a little data and then do a read. I
> only happens when you read from an empty ring buffer, then do a small
> write, and then read again. That's where it will get you.
Ah, see what you mean now. That seems possible.
>
>>
>> I don't quite understand what you're describing. Here's what I think
>> is happening though:
>>
>> When the ring buffer is reset, commit_page == tail_page == head_page.
>
> And at this moment, the writer is off the reader page and will not be on
> the reader page. The only way to get date at this point is if you do a
> swap.
>
>> rb_get_reader_page() will pick up the head page.
>
> Right. But this requires a read to happen. If you never do a read, then
> rb_get_reader_page will not pick up the head page. The problem is when
> you read from an empty ring buffer.
>
>
>> Then a few (less than
>> 1 page worth) writes happen, on the tail_page which is currently (or
>> soon to be) also the reader_page. Now read == 0, but
>> rb_page_size(reader) is however many bytes have been written to the
>> page.
>
> Correct.
>
>>
>> So we have valid data on the reader page, but read_stamp has not been set yet.
>
> Yep
>
>>
>> >
>> > Or it may just be simpler to take your patch.
>>
>> Please? :)
>
> Now I think you may understand my patch.
Yeah, mostly. At least enough that I think it's worth testing. But Monday.
>> On Thu, Jun 21, 2012 at 8:56 AM, Steven Rostedt <[email protected]> wrote:
>> > Does something like this work for you. Note, this is totally untested!
>> >
>> > -- Steve
>> >
>> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
>> > index ad0239b..5943044 100644
>> > --- a/kernel/trace/ring_buffer.c
>> > +++ b/kernel/trace/ring_buffer.c
>> > @@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
>> > if (cpu_buffer->commit_page == cpu_buffer->reader_page)
>> > goto out;
>> >
>> > + /* Don't bother swapping if the ring buffer is empty */
>>
>> This doesn't make sense, since the ring buffer isn't empty in this
>> scenario. (I didn't bother testing it.)
>
> Yes it is! If you never did a read on the empty buffer, the *only* way
> to get what was written is to do the swap.
>
> The problem you are seeing is that you did a read on an empty buffer
> before writing. In this case the swap happens but all we have is an
> empty page with the write pointer (commit and tail) on it. When a write
> happens, and you do another read, *then* we do not do the swap and you
> get the stale time stamp.
>
> Thus, if you never let the first swap happen, you wont get into this
> scenario that you read and do not update the read_stamp.
>
>>
>> > + if (rb_num_of_entries(cpu_buffer) == 0)
>>
>> Did you mean rb_page_entries(something?)
>
> Nope, I meant the actual ring buffer. If the ring buffer is empty, do
> not perform the swap. Because we only update the read stamp when we
> perform a swap, and it takes a write to happen before the page's
> timestamp is updated. If we do the swap before any writes happen, we get
> an invalid timestamp from the page.
Oops, I only asked because I'm blind, or can't type, or something,
because I couldn't find rb_num_of_entries().
>
> Now do you understand?
>
> -- Steve
>
>>
>> > + goto out;
>> > +
>> > /*
>> > * Reset the reader page to size zero.
>> > */
>> >
>> >
>
>
On Fri, Jun 22, 2012 at 6:27 PM, David Sharp <[email protected]> wrote:
> On Fri, Jun 22, 2012 at 4:31 PM, Steven Rostedt <[email protected]> wrote:
>>
>> Now I think you may understand my patch.
>
> Yeah, mostly. At least enough that I think it's worth testing. But Monday.
I got around to testing your patch today, and it fixes the issue. No
bad-looking timestamps in 40 runs, whereas could reproduce within 3
runs before. Do you want me to send you a fresh patch, or just use the
one you have?
>>> On Thu, Jun 21, 2012 at 8:56 AM, Steven Rostedt <[email protected]> wrote:
>>> > Does something like this work for you. Note, this is totally untested!
>>> >
>>> > -- Steve
>>> >
>>> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
>>> > index ad0239b..5943044 100644
>>> > --- a/kernel/trace/ring_buffer.c
>>> > +++ b/kernel/trace/ring_buffer.c
>>> > @@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
>>> > if (cpu_buffer->commit_page == cpu_buffer->reader_page)
>>> > goto out;
>>> >
>>> > + /* Don't bother swapping if the ring buffer is empty */
>>> > + if (rb_num_of_entries(cpu_buffer) == 0)
>>> > + goto out;
>>> > +
>>> > /*
>>> > * Reset the reader page to size zero.
>>> > */
On Tue, 2012-06-26 at 17:35 -0700, David Sharp wrote:
> On Fri, Jun 22, 2012 at 6:27 PM, David Sharp <[email protected]> wrote:
> > On Fri, Jun 22, 2012 at 4:31 PM, Steven Rostedt <[email protected]> wrote:
> >>
> >> Now I think you may understand my patch.
> >
> > Yeah, mostly. At least enough that I think it's worth testing. But Monday.
>
> I got around to testing your patch today, and it fixes the issue. No
> bad-looking timestamps in 40 runs, whereas could reproduce within 3
> runs before. Do you want me to send you a fresh patch, or just use the
> one you have?
I rather use this one for two reasons.
1) it limits the places where read_stamp is updated. And I rather not
add an update because "it fixes an anomaly".
2) I think it is wrong to force the writer on the reader page when no
write has been made. There's some side effects that this causes. One is
that if you do a read with no write, and then do nothing, it forces the
writer on that page. Now if a lot of writes happen (function tracing),
the writes that were on the reader page are never overwritten when the
buffer is full. Then you get a page of very old data, followed by a
buffer full of new data.
-- Steve
>
> >>> On Thu, Jun 21, 2012 at 8:56 AM, Steven Rostedt <[email protected]> wrote:
> >>> > Does something like this work for you. Note, this is totally untested!
> >>> >
> >>> > -- Steve
> >>> >
> >>> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> >>> > index ad0239b..5943044 100644
> >>> > --- a/kernel/trace/ring_buffer.c
> >>> > +++ b/kernel/trace/ring_buffer.c
> >>> > @@ -3246,6 +3246,10 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
> >>> > if (cpu_buffer->commit_page == cpu_buffer->reader_page)
> >>> > goto out;
> >>> >
> >>> > + /* Don't bother swapping if the ring buffer is empty */
> >>> > + if (rb_num_of_entries(cpu_buffer) == 0)
> >>> > + goto out;
> >>> > +
> >>> > /*
> >>> > * Reset the reader page to size zero.
> >>> > */
On Wed, Jun 27, 2012 at 5:46 AM, Steven Rostedt <[email protected]> wrote:
> On Tue, 2012-06-26 at 17:35 -0700, David Sharp wrote:
> > On Fri, Jun 22, 2012 at 6:27 PM, David Sharp <[email protected]> wrote:
> > > On Fri, Jun 22, 2012 at 4:31 PM, Steven Rostedt <[email protected]> wrote:
> > >>
> > >> Now I think you may understand my patch.
> > >
> > > Yeah, mostly. At least enough that I think it's worth testing. But Monday.
> >
> > I got around to testing your patch today, and it fixes the issue. No
> > bad-looking timestamps in 40 runs, whereas could reproduce within 3
> > runs before. Do you want me to send you a fresh patch, or just use the
> > one you have?
>
> I rather use this one for two reasons.
I just meant, do you want me to send you a version of your patch with
my description (I'll update it, obviously), or do it yourself.
>
> 1) it limits the places where read_stamp is updated. And I rather not
> add an update because "it fixes an anomaly".
>
> 2) I think it is wrong to force the writer on the reader page when no
> write has been made. There's some side effects that this causes. One is
> that if you do a read with no write, and then do nothing, it forces the
> writer on that page. Now if a lot of writes happen (function tracing),
> the writes that were on the reader page are never overwritten when the
> buffer is full. Then you get a page of very old data, followed by a
> buffer full of new data.
I have always disliked that page of very old data, so I'm really happy
that this will get rid of it. As I recall, you once claimed this was a
"feature". :)
On Wed, 2012-06-27 at 11:00 -0700, David Sharp wrote:
> I just meant, do you want me to send you a version of your patch with
> my description (I'll update it, obviously), or do it yourself.
I'll do it. Although, I think this may just go for 3.6 and not 3.5, as
the bug is really minor.
> I have always disliked that page of very old data, so I'm really happy
> that this will get rid of it. As I recall, you once claimed this was a
> "feature". :)
Of course it was a feature. But we'll just act like Gnome and remove
it ;-)
-- Steve