2013-03-18 13:33:33

by Stephane Eranian

Subject: [PATCH v2] perf: fix ring_buffer perf_output_space() boundary calculation


This patch fixes a flaw in perf_output_space(). When the space
requested is bigger than the actual buffer size, the function may
return true (i.e., there is space) when it should not: the masking
logic wraps both values modulo the buffer size, so head > offset can
still hold after masking.

The problem can be reproduced by activating BTS on Intel processors.
A BTS record can be as big as 16 pages. The following command
fails:

$ perf record -m 4 -c 1 -e branches:u my_test_program

This corrupts the buffer, and perf report is then unable to parse
the resulting perf.data.

The fix is to first check that the requested space is no larger than the
buffer size. If so, then the masking logic will work fine. If not, then
there is no chance the record can be saved and it will be gracefully handled
by upper code layers.
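
A minimal user-space sketch of why the extra check matters. It assumes
4 KiB pages, and it assumes that the function ends by comparing the masked
head against the masked offset (that part of the function is not shown in
this patch). struct rb_model, space_old(), space_new() and
rb_model_selfcheck() are hypothetical names used only for illustration;
they are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ring_buffer: only what the check needs. */
struct rb_model {
	unsigned long size;	/* data area size in bytes, power of two */
	int overwrite;		/* non-zero: writer may overwrite its own tail */
};

/* Pre-patch behaviour: masking only. */
static bool space_old(const struct rb_model *rb, unsigned long tail,
		      unsigned long offset, unsigned long head)
{
	unsigned long mask = rb->size - 1;

	if (rb->overwrite)
		return true;

	offset = (offset - tail) & mask;
	head = (head - tail) & mask;

	/* Assumed final comparison: space exists if head is not behind offset. */
	return head >= offset;
}

/* Post-patch behaviour: reject records larger than the buffer first. */
static bool space_new(const struct rb_model *rb, unsigned long tail,
		      unsigned long offset, unsigned long head)
{
	unsigned long sz = rb->size;
	unsigned long mask = sz - 1;

	if (rb->overwrite)
		return true;

	if ((head - offset) > sz)
		return false;

	offset = (offset - tail) & mask;
	head = (head - tail) & mask;

	return head >= offset;
}

/* Walks through the "-m 4" case with a 16-page record. */
void rb_model_selfcheck(void)
{
	struct rb_model rb = { .size = 4 * 4096UL, .overwrite = 0 };
	unsigned long record = 16 * 4096UL;

	/* 65536 & 16383 == 0, so the masked head equals the masked offset. */
	assert(space_old(&rb, 0, 0, record));	/* wrongly reports space */
	assert(!space_new(&rb, 0, 0, record));	/* rejected up front */

	/* A 100-byte record in an otherwise empty buffer fits either way. */
	assert(space_old(&rb, 0, 1000, 1100));
	assert(space_new(&rb, 0, 1000, 1100));
}
```

Calling rb_model_selfcheck() from a small main() exercises both versions;
the only input on which they disagree here is the oversized record.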

In v2, we also make the logic of the writable field more explicit by
renaming it to rb->overwrite, since it tells whether or not the
buffer can overwrite its tail (suggested by PeterZ).
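
The rename also inverts the sense of the field, which can be sketched as
follows. The value given to RING_BUFFER_WRITABLE is an assumption made for
illustration (its definition is not part of this patch), and
overwrite_from_flags() is a hypothetical helper, not a kernel function.

```c
/* Assumed value, for illustration only. */
#define RING_BUFFER_WRITABLE 0x01

/*
 * Mirrors what the patched ring_buffer_init() stores in rb->overwrite:
 * a buffer set up with RING_BUFFER_WRITABLE must not overwrite its tail,
 * any other buffer may.
 */
static int overwrite_from_flags(int flags)
{
	return (flags & RING_BUFFER_WRITABLE) ? 0 : 1;
}
```

With this mapping, the old test "!rb->writable" and the new test
"rb->overwrite" select the same buffers.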

Signed-off-by: Stephane Eranian <[email protected]>
---

diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index d56a64c..eb675c4 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -16,7 +16,7 @@ struct ring_buffer {
 	int page_order; /* allocation order */
 #endif
 	int nr_pages; /* nr of data pages */
-	int writable; /* are we writable */
+	int overwrite; /* can overwrite itself */

 	atomic_t poll; /* POLL_ for wakeups */

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 23cb34f..f52d1e7 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -18,12 +18,24 @@
 static bool perf_output_space(struct ring_buffer *rb, unsigned long tail,
 			      unsigned long offset, unsigned long head)
 {
-	unsigned long mask;
+	unsigned long sz = perf_data_size(rb);
+	unsigned long mask = sz - 1;

-	if (!rb->writable)
+	/*
+	 * check if user-writable
+	 * overwrite : over-write its own tail
+	 * !overwrite: buffer possibly drops events.
+	 */
+	if (rb->overwrite)
 		return true;

-	mask = perf_data_size(rb) - 1;
+	/*
+	 * verify that payload is not bigger than buffer
+	 * otherwise masking logic may fail to detect
+	 * the "not enough space" condition
+	 */
+	if ((head - offset) > sz)
+		return false;

 	offset = (offset - tail) & mask;
 	head = (head - tail) & mask;
@@ -212,7 +224,9 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
 		rb->watermark = max_size / 2;

 	if (flags & RING_BUFFER_WRITABLE)
-		rb->writable = 1;
+		rb->overwrite = 0;
+	else
+		rb->overwrite = 1;

 	atomic_set(&rb->refcount, 1);


2013-03-19 12:35:16

by Peter Zijlstra

Subject: Re: [PATCH v2] perf: fix ring_buffer perf_output_space() boundary calculation

On Mon, 2013-03-18 at 14:33 +0100, Stephane Eranian wrote:
> This patch fixes a flaw in perf_output_space(). In case the size
> of the space needed is bigger than the actual buffer size, there
> may be situations where the function would return true (i.e., there
> is space) when it should not. head > offset due to rounding of the
> masking logic.
>
> The problem can be tested by activating BTS on Intel processors.
> A BTS record can be as big as 16 pages. The following command
> fails:
>
> $ perf record -m 4 -c 1 -e branches:u my_test_program
>
> You will get a buffer corruption with this. Perf report won't be
> able to parse the perf.data.
>
> The fix is to first check that the requested space is smaller than the
> buffer size. If so, then the masking logic will work fine. If not, then
> there is no chance the record can be saved and it will be gracefully handled
> by upper code layers.
>
> In v2, we also make the logic for the writable more explicit by
> renaming it to rb->overwrite because it tells whether or not the
> buffer can overwrite its tail (suggested by PeterZ).
>
> Signed-off-by: Stephane Eranian <[email protected]>
> ---

Acked-by: Peter Zijlstra <[email protected]>

Thanks!

Subject: [tip:perf/urgent] perf: Fix ring_buffer perf_output_space() boundary calculation

Commit-ID: dd9c086d9f507d02d5ba4d7c5eef4bb9518088b8
Gitweb: http://git.kernel.org/tip/dd9c086d9f507d02d5ba4d7c5eef4bb9518088b8
Author: Stephane Eranian <[email protected]>
AuthorDate: Mon, 18 Mar 2013 14:33:28 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 21 Mar 2013 12:04:35 +0100

perf: Fix ring_buffer perf_output_space() boundary calculation

This patch fixes a flaw in perf_output_space(). When the space
requested is bigger than the actual buffer size, the function may
return true (i.e., there is space) when it should not: the
masking logic wraps both values modulo the buffer size, so
head > offset can still hold after masking.

The problem can be tested by activating BTS on Intel processors.
A BTS record can be as big as 16 pages. The following command
fails:

$ perf record -m 4 -c 1 -e branches:u my_test_program

You will get a buffer corruption with this. Perf report won't be
able to parse the perf.data.

The fix is to first check that the requested space is no larger
than the buffer size. If so, then the masking logic will work
fine. If not, then there is no chance the record can be saved
and it will be gracefully handled by upper code layers.

[ In v2, we also make the logic of the writable field more explicit
by renaming it to rb->overwrite, since it tells whether or not the
buffer can overwrite its tail (suggested by PeterZ). ]

Signed-off-by: Stephane Eranian <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/20130318133327.GA3056@quad
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/events/internal.h | 2 +-
kernel/events/ring_buffer.c | 22 ++++++++++++++++++----
2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index d56a64c..eb675c4 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -16,7 +16,7 @@ struct ring_buffer {
 	int page_order; /* allocation order */
 #endif
 	int nr_pages; /* nr of data pages */
-	int writable; /* are we writable */
+	int overwrite; /* can overwrite itself */

 	atomic_t poll; /* POLL_ for wakeups */

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 23cb34f..97fddb0 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -18,12 +18,24 @@
 static bool perf_output_space(struct ring_buffer *rb, unsigned long tail,
 			      unsigned long offset, unsigned long head)
 {
-	unsigned long mask;
+	unsigned long sz = perf_data_size(rb);
+	unsigned long mask = sz - 1;

-	if (!rb->writable)
+	/*
+	 * check if user-writable
+	 * overwrite : over-write its own tail
+	 * !overwrite: buffer possibly drops events.
+	 */
+	if (rb->overwrite)
 		return true;

-	mask = perf_data_size(rb) - 1;
+	/*
+	 * verify that payload is not bigger than buffer
+	 * otherwise masking logic may fail to detect
+	 * the "not enough space" condition
+	 */
+	if ((head - offset) > sz)
+		return false;

 	offset = (offset - tail) & mask;
 	head = (head - tail) & mask;
@@ -212,7 +224,9 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
 		rb->watermark = max_size / 2;

 	if (flags & RING_BUFFER_WRITABLE)
-		rb->writable = 1;
+		rb->overwrite = 0;
+	else
+		rb->overwrite = 1;

 	atomic_set(&rb->refcount, 1);